Compare commits

..

3 Commits

| SHA1 | Message | Date |
| --- | --- | --- |
| e400aeb770 | examples : add new sources (ggml-ci) | 2025-04-02 15:52:29 +03:00 |
| cb9a21b957 | sync : ggml | 2025-04-02 15:52:29 +03:00 |
| dacb7caed6 | cpu: move all the operators into a separate c++ file (except mul_mat) (ggml/1167) | 2025-04-02 15:52:28 +03:00 |

Full message of dacb7caed6:

* cpu: refactor SIMD mappings and vectorized op functions into separate files
* Fix warning for ggml_float to float
* Fix warnings
* cpu: move all the operations (except mul_mat) to a separate c++ file
* fix whitespace
* Update ggml/src/ggml-cpu/vec.h

Co-authored-by: Diego Devesa <slarengh@gmail.com>

* Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp
* Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously

Co-authored-by: Diego Devesa <slarengh@gmail.com>
35 changed files with 281 additions and 2956 deletions

View File

@@ -1,6 +1,6 @@
cmake_minimum_required(VERSION 3.5) # for add_link_options and implicit target directories.
project("whisper.cpp" C CXX)
project("whisper.cpp" VERSION 1.7.5)
project("whisper.cpp" VERSION 1.7.4)
include(CheckIncludeFileCXX)
set(SOVERSION 1)

View File

@@ -2,12 +2,15 @@
![whisper.cpp](https://user-images.githubusercontent.com/1991296/235238348-05d0f6a4-da44-4900-a1de-d0707e75b763.jpeg)
[![Actions Status](https://github.com/ggml-org/whisper.cpp/workflows/CI/badge.svg)](https://github.com/ggml-org/whisper.cpp/actions)
[![Actions Status](https://github.com/ggerganov/whisper.cpp/workflows/CI/badge.svg)](https://github.com/ggerganov/whisper.cpp/actions)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![Conan Center](https://shields.io/conan/v/whisper-cpp)](https://conan.io/center/whisper-cpp)
[![npm](https://img.shields.io/npm/v/whisper.cpp.svg)](https://www.npmjs.com/package/whisper.cpp/)
Stable: [v1.7.5](https://github.com/ggml-org/whisper.cpp/releases/tag/v1.7.5) / [Roadmap](https://github.com/orgs/ggml-org/projects/4/)
> [!NOTE]
> New maintenance roadmap: https://github.com/ggerganov/whisper.cpp/discussions/2788
Stable: [v1.7.4](https://github.com/ggerganov/whisper.cpp/releases/tag/v1.7.4) / [Roadmap | F.A.Q.](https://github.com/ggerganov/whisper.cpp/discussions/126)
High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisper) automatic speech recognition (ASR) model:
@@ -23,7 +26,7 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
- [Efficient GPU support for NVIDIA](#nvidia-gpu-support)
- [OpenVINO Support](#openvino-support)
- [Ascend NPU Support](#ascend-npu-support)
- [C-style API](https://github.com/ggml-org/whisper.cpp/blob/master/include/whisper.h)
- [C-style API](https://github.com/ggerganov/whisper.cpp/blob/master/include/whisper.h)
Supported platforms:
@@ -31,14 +34,14 @@ Supported platforms:
- [x] [iOS](examples/whisper.objc)
- [x] [Android](examples/whisper.android)
- [x] [Java](bindings/java/README.md)
- [x] Linux / [FreeBSD](https://github.com/ggml-org/whisper.cpp/issues/56#issuecomment-1350920264)
- [x] Linux / [FreeBSD](https://github.com/ggerganov/whisper.cpp/issues/56#issuecomment-1350920264)
- [x] [WebAssembly](examples/whisper.wasm)
- [x] Windows ([MSVC](https://github.com/ggml-org/whisper.cpp/blob/master/.github/workflows/build.yml#L117-L144) and [MinGW](https://github.com/ggml-org/whisper.cpp/issues/168))
- [x] [Raspberry Pi](https://github.com/ggml-org/whisper.cpp/discussions/166)
- [x] [Docker](https://github.com/ggml-org/whisper.cpp/pkgs/container/whisper.cpp)
- [x] Windows ([MSVC](https://github.com/ggerganov/whisper.cpp/blob/master/.github/workflows/build.yml#L117-L144) and [MinGW](https://github.com/ggerganov/whisper.cpp/issues/168))
- [x] [Raspberry Pi](https://github.com/ggerganov/whisper.cpp/discussions/166)
- [x] [Docker](https://github.com/ggerganov/whisper.cpp/pkgs/container/whisper.cpp)
The entire high-level implementation of the model is contained in [whisper.h](include/whisper.h) and [whisper.cpp](src/whisper.cpp).
The rest of the code is part of the [`ggml`](https://github.com/ggml-org/ggml) machine learning library.
The rest of the code is part of the [`ggml`](https://github.com/ggerganov/ggml) machine learning library.
Having such a lightweight implementation of the model makes it easy to integrate into different platforms and applications.
As an example, here is a video of running the model on an iPhone 13 device - fully offline, on-device: [whisper.objc](examples/whisper.objc)
@@ -51,14 +54,14 @@ https://user-images.githubusercontent.com/1991296/204038393-2f846eae-c255-4099-a
On Apple Silicon, the inference runs fully on the GPU via Metal:
https://github.com/ggml-org/whisper.cpp/assets/1991296/c82e8f86-60dc-49f2-b048-d2fdbd6b5225
https://github.com/ggerganov/whisper.cpp/assets/1991296/c82e8f86-60dc-49f2-b048-d2fdbd6b5225
## Quick start
First clone the repository:
```bash
git clone https://github.com/ggml-org/whisper.cpp.git
git clone https://github.com/ggerganov/whisper.cpp.git
```
Navigate into the directory:
@@ -149,7 +152,6 @@ standard cmake setup with:
cmake -B build -DGGML_BLAS=1
cmake --build build --config Release
./build/bin/whisper-cli [ .. etc .. ]
```
## Quantization
@@ -223,7 +225,7 @@ speed-up - more than x3 faster compared with CPU-only execution. Here are the in
The first run on a device is slow, since the ANE service compiles the Core ML model to some device-specific format.
Subsequent runs are faster.
For more information about the Core ML implementation please refer to PR [#566](https://github.com/ggml-org/whisper.cpp/pull/566).
For more information about the Core ML implementation please refer to PR [#566](https://github.com/ggerganov/whisper.cpp/pull/566).
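In practice, enabling Core ML looks roughly like this (a minimal sketch based on the scripts shipped in the repository; model name and paths are illustrative):
```bash
# generate a Core ML model of the encoder (assumes the required Python packages are installed)
./models/generate-coreml-model.sh base.en

# rebuild with Core ML support
cmake -B build -DWHISPER_COREML=1
cmake --build build -j --config Release

# run as usual - the Core ML encoder is picked up automatically when present
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```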
## OpenVINO support
@@ -308,7 +310,7 @@ This can result in significant speedup in encoder performance. Here are the inst
The first run on an OpenVINO device is slow, since the OpenVINO framework will compile the IR (Intermediate Representation) model to a device-specific 'blob'. This device-specific blob will get
cached for the next run.
For more information about the OpenVINO implementation please refer to PR [#1037](https://github.com/ggml-org/whisper.cpp/pull/1037).
For more information about the OpenVINO implementation please refer to PR [#1037](https://github.com/ggerganov/whisper.cpp/pull/1037).
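A minimal sketch of the workflow, assuming the converter script and build flag shipped with the repository:
```bash
# generate an OpenVINO IR of the base.en encoder (converter script and options assumed)
python models/convert-whisper-to-openvino.py --model base.en

# rebuild with OpenVINO support
cmake -B build -DWHISPER_OPENVINO=1
cmake --build build -j --config Release
```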
## NVIDIA GPU support
@@ -386,8 +388,8 @@ Run the inference examples as usual, for example:
We have two Docker images available for this project:
1. `ghcr.io/ggml-org/whisper.cpp:main`: This image includes the main executable file as well as `curl` and `ffmpeg`. (platforms: `linux/amd64`, `linux/arm64`)
2. `ghcr.io/ggml-org/whisper.cpp:main-cuda`: Same as `main` but compiled with CUDA support. (platforms: `linux/amd64`)
1. `ghcr.io/ggerganov/whisper.cpp:main`: This image includes the main executable file as well as `curl` and `ffmpeg`. (platforms: `linux/amd64`, `linux/arm64`)
2. `ghcr.io/ggerganov/whisper.cpp:main-cuda`: Same as `main` but compiled with CUDA support. (platforms: `linux/amd64`)
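A typical invocation might look like this (a sketch; the mounted paths and model file are illustrative, and the image is assumed to accept a command string):
```bash
# transcribe a sample with the prebuilt image, mounting a local models directory
docker run -it --rm \
  -v "$(pwd)/models:/models" \
  ghcr.io/ggml-org/whisper.cpp:main \
  "whisper-cli -m /models/ggml-base.bin -f ./samples/jfk.wav"
```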
### Usage
@@ -425,8 +427,8 @@ For detailed instructions on how to use Conan, please refer to the [Conan docume
This is a naive example of performing real-time inference on audio from your microphone.
The [stream](examples/stream) tool samples the audio every half a second and runs the transcription continuously.
More info is available in [issue #10](https://github.com/ggml-org/whisper.cpp/issues/10).
You will need to have [sdl2](https://wiki.libsdl.org/SDL2/Installation) installed for it to work properly.
More info is available in [issue #10](https://github.com/ggerganov/whisper.cpp/issues/10).
You will need to have [sdl2](https://wiki.libsdl.org/SDL2/Installation) installed for it to work properly.
```bash
cmake -B build -DWHISPER_SDL2=ON
@@ -514,7 +516,7 @@ main: processing './samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 pr
## Speaker segmentation via tinydiarize (experimental)
More information about this approach is available here: https://github.com/ggml-org/whisper.cpp/pull/1058
More information about this approach is available here: https://github.com/ggerganov/whisper.cpp/pull/1058
Sample usage:
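A minimal sketch, assuming a tinydiarize-enabled model (`ggml-small.en-tdrz.bin`) and the `-tdrz` flag:
```bash
# transcribe with speaker-turn detection (model name and flag assumed)
./build/bin/whisper-cli -m ./models/ggml-small.en-tdrz.bin -f ./samples/a13.wav -tdrz
```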
@@ -578,7 +580,7 @@ https://user-images.githubusercontent.com/1991296/199337538-b7b0c7a3-2753-4a88-a
## Video comparison of different models
Use the [scripts/bench-wts.sh](https://github.com/ggml-org/whisper.cpp/blob/master/scripts/bench-wts.sh) script to generate a video in the following format:
Use the [scripts/bench-wts.sh](https://github.com/ggerganov/whisper.cpp/blob/master/scripts/bench-wts.sh) script to generate a video in the following format:
```bash
./scripts/bench-wts.sh samples/jfk.wav
@@ -595,7 +597,7 @@ In order to have an objective comparison of the performance of the inference acr
use the [whisper-bench](examples/bench) tool. The tool simply runs the Encoder part of the model and prints how much time it
took to execute it. The results are summarized in the following Github issue:
[Benchmark results](https://github.com/ggml-org/whisper.cpp/issues/89)
[Benchmark results](https://github.com/ggerganov/whisper.cpp/issues/89)
Additionally, a script to run whisper.cpp with different models and audio files is provided: [bench.py](scripts/bench.py).
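For instance, a typical run might look like this (a sketch; the model path is illustrative):
```bash
# time the encoder of the base.en model using 4 threads
./build/bin/whisper-bench -m ./models/ggml-base.en.bin -t 4
```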
@@ -622,24 +624,25 @@ You can download the converted models using the [models/download-ggml-model.sh](
or manually from here:
- https://huggingface.co/ggerganov/whisper.cpp
- https://ggml.ggerganov.com
For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or [models/README.md](models/README.md).
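For example, fetching and using a model looks roughly like this (a sketch; the model name is illustrative):
```bash
# fetch the base.en model into the models directory
./models/download-ggml-model.sh base.en

# use it for transcription
./build/bin/whisper-cli -m models/ggml-base.en.bin -f samples/jfk.wav
```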
## [Bindings](https://github.com/ggml-org/whisper.cpp/discussions/categories/bindings)
## [Bindings](https://github.com/ggerganov/whisper.cpp/discussions/categories/bindings)
- [x] Rust: [tazz4843/whisper-rs](https://github.com/tazz4843/whisper-rs) | [#310](https://github.com/ggml-org/whisper.cpp/discussions/310)
- [x] JavaScript: [bindings/javascript](bindings/javascript) | [#309](https://github.com/ggml-org/whisper.cpp/discussions/309)
- [x] Rust: [tazz4843/whisper-rs](https://github.com/tazz4843/whisper-rs) | [#310](https://github.com/ggerganov/whisper.cpp/discussions/310)
- [x] JavaScript: [bindings/javascript](bindings/javascript) | [#309](https://github.com/ggerganov/whisper.cpp/discussions/309)
- React Native (iOS / Android): [whisper.rn](https://github.com/mybigday/whisper.rn)
- [x] Go: [bindings/go](bindings/go) | [#312](https://github.com/ggml-org/whisper.cpp/discussions/312)
- [x] Go: [bindings/go](bindings/go) | [#312](https://github.com/ggerganov/whisper.cpp/discussions/312)
- [x] Java:
- [GiviMAD/whisper-jni](https://github.com/GiviMAD/whisper-jni)
- [x] Ruby: [bindings/ruby](bindings/ruby) | [#507](https://github.com/ggml-org/whisper.cpp/discussions/507)
- [x] Objective-C / Swift: [ggml-org/whisper.spm](https://github.com/ggml-org/whisper.spm) | [#313](https://github.com/ggml-org/whisper.cpp/discussions/313)
- [x] Ruby: [bindings/ruby](bindings/ruby) | [#507](https://github.com/ggerganov/whisper.cpp/discussions/507)
- [x] Objective-C / Swift: [ggerganov/whisper.spm](https://github.com/ggerganov/whisper.spm) | [#313](https://github.com/ggerganov/whisper.cpp/discussions/313)
- [exPHAT/SwiftWhisper](https://github.com/exPHAT/SwiftWhisper)
- [x] .NET: | [#422](https://github.com/ggml-org/whisper.cpp/discussions/422)
- [x] .NET: | [#422](https://github.com/ggerganov/whisper.cpp/discussions/422)
- [sandrohanea/whisper.net](https://github.com/sandrohanea/whisper.net)
- [NickDarvey/whisper](https://github.com/NickDarvey/whisper)
- [x] Python: | [#9](https://github.com/ggml-org/whisper.cpp/issues/9)
- [x] Python: | [#9](https://github.com/ggerganov/whisper.cpp/issues/9)
- [stlukey/whispercpp.py](https://github.com/stlukey/whispercpp.py) (Cython)
- [AIWintermuteAI/whispercpp](https://github.com/AIWintermuteAI/whispercpp) (Updated fork of aarnphm/whispercpp)
- [aarnphm/whispercpp](https://github.com/aarnphm/whispercpp) (Pybind11)
@@ -647,33 +650,6 @@ For more details, see the conversion script [models/convert-pt-to-ggml.py](model
- [x] R: [bnosac/audio.whisper](https://github.com/bnosac/audio.whisper)
- [x] Unity: [macoron/whisper.unity](https://github.com/Macoron/whisper.unity)
## XCFramework
The XCFramework is a precompiled version of the library for iOS, visionOS, tvOS,
and macOS. It can be used in Swift projects without the need to compile the
library from source. For example:
```swift
// swift-tools-version: 5.10
// The swift-tools-version declares the minimum version of Swift required to build this package.
import PackageDescription
let package = Package(
name: "Whisper",
targets: [
.executableTarget(
name: "Whisper",
dependencies: [
"WhisperFramework"
]),
.binaryTarget(
name: "WhisperFramework",
url: "https://github.com/ggml-org/whisper.cpp/releases/download/v1.7.5/whisper-v1.7.5-xcframework.zip",
checksum: "c7faeb328620d6012e130f3d705c51a6ea6c995605f2df50f6e1ad68c59c6c4a"
)
]
)
```
## Examples
There are various examples of using the library for different projects in the [examples](examples) folder.
@@ -692,13 +668,13 @@ Some of the examples are even ported to run in the browser using WebAssembly. Ch
| [whisper.android](examples/whisper.android) | | Android mobile application using whisper.cpp |
| [whisper.nvim](examples/whisper.nvim) | | Speech-to-text plugin for Neovim |
| [generate-karaoke.sh](examples/generate-karaoke.sh) | | Helper script to easily [generate a karaoke video](https://youtu.be/uj7hVta4blM) of raw audio capture |
| [livestream.sh](examples/livestream.sh) | | [Livestream audio transcription](https://github.com/ggml-org/whisper.cpp/issues/185) |
| [livestream.sh](examples/livestream.sh) | | [Livestream audio transcription](https://github.com/ggerganov/whisper.cpp/issues/185) |
| [yt-wsp.sh](examples/yt-wsp.sh) | | Download + transcribe and/or translate any VOD [(original)](https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818) |
| [wchess](examples/wchess) | [wchess.wasm](examples/wchess) | Voice-controlled chess |
## [Discussions](https://github.com/ggml-org/whisper.cpp/discussions)
## [Discussions](https://github.com/ggerganov/whisper.cpp/discussions)
If you have any kind of feedback about this project feel free to use the Discussions section and open a new topic.
You can use the [Show and tell](https://github.com/ggml-org/whisper.cpp/discussions/categories/show-and-tell) category
You can use the [Show and tell](https://github.com/ggerganov/whisper.cpp/discussions/categories/show-and-tell) category
to share your own projects that use `whisper.cpp`. If you have a question, make sure to check the
[Frequently asked questions (#126)](https://github.com/ggml-org/whisper.cpp/discussions/126) discussion.
[Frequently asked questions (#126)](https://github.com/ggerganov/whisper.cpp/discussions/126) discussion.

View File

@@ -51,7 +51,7 @@ func main() {
In order to build, you need to have the Go compiler installed. You can get it from [here](https://golang.org/dl/). Run the tests with:
```bash
git clone https://github.com/ggml-org/whisper.cpp.git
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp/bindings/go
make test
```
@@ -98,7 +98,7 @@ The API Documentation:
Getting help:
* Follow the discussion for the go bindings [here](https://github.com/ggml-org/whisper.cpp/discussions/312)
* Follow the discussion for the go bindings [here](https://github.com/ggerganov/whisper.cpp/discussions/312)
## License

View File

@@ -1,5 +1,5 @@
/*
github.com/ggml-org/whisper.cpp/bindings/go
github.com/ggerganov/whisper.cpp/bindings/go
provides speech-to-text service bindings for the Go programming language.
*/
package whisper

View File

@@ -31,10 +31,10 @@ public class Example {
var whisperParams = whisper.getFullDefaultParams(WhisperSamplingStrategy.WHISPER_SAMPLING_GREEDY);
// custom configuration if required
whisperParams.temperature_inc = 0f;
var samples = readAudio(); // divide each value by 32767.0f
whisper.fullTranscribe(whisperParams, samples);
int segmentCount = whisper.getTextSegmentCount(context);
for (int i = 0; i < segmentCount; i++) {
String text = whisper.getTextSegment(context, i);
@@ -52,7 +52,7 @@ public class Example {
In order to build, you need to have JDK 8 or higher installed. Run the tests with:
```bash
git clone https://github.com/ggml-org/whisper.cpp.git
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp/bindings/java
./gradlew build

View File

@@ -1,6 +1,6 @@
{
"name": "whisper.cpp",
"version": "1.7.5",
"version": "1.7.4",
"description": "Whisper speech recognition",
"main": "whisper.js",
"scripts": {

View File

@@ -228,7 +228,7 @@ The second argument `samples` may be an array, an object with `length` and `each
Development
-----------
% git clone https://github.com/ggml-org/whisper.cpp.git
% git clone https://github.com/ggerganov/whisper.cpp.git
% cd whisper.cpp/bindings/ruby
% rake test
@@ -241,5 +241,5 @@ License
The same as [whisper.cpp][].
[whisper.cpp]: https://github.com/ggml-org/whisper.cpp
[models]: https://github.com/ggml-org/whisper.cpp/tree/master/models
[whisper.cpp]: https://github.com/ggerganov/whisper.cpp
[models]: https://github.com/ggerganov/whisper.cpp/tree/master/models

View File

@@ -918,7 +918,7 @@ ruby_whisper_params_initialize(int argc, VALUE *argv, VALUE self)
return self;
}
rb_get_kwargs(kw_hash, param_names, 0, RUBY_WHISPER_PARAMS_PARAM_NAMES_COUNT, values);
rb_get_kwargs(kw_hash, &param_names, 0, RUBY_WHISPER_PARAMS_PARAM_NAMES_COUNT, &values);
Data_Get_Struct(self, ruby_whisper_params, rwp);
for (i = 0; i < RUBY_WHISPER_PARAMS_PARAM_NAMES_COUNT; i++) {

View File

@@ -34,7 +34,7 @@ module Whisper
when /darwin/
Pathname(Dir.home)/"Library/Caches"
else
ENV.key?("XDG_CACHE_HOME") ? Pathname(ENV["XDG_CACHE_HOME"]) : Pathname(Dir.home)/".cache"
ENV.key?("XDG_CACHE_HOME") ? ENV["XDG_CACHE_HOME"] : Pathname(Dir.home)/".cache"
end
base/"whisper.cpp"
end

View File

@@ -26,7 +26,7 @@ Gem::Specification.new do |s|
s.required_ruby_version = '>= 3.1.0'
#### Documentation and testing.
s.homepage = 'https://github.com/ggml-org/whisper.cpp'
s.homepage = 'https://github.com/ggerganov/whisper.cpp'
s.rdoc_options = ['--main', 'README.md']

View File

@@ -41,11 +41,6 @@ COMMON_CMAKE_ARGS=(
-DGGML_OPENMP=${GGML_OPENMP}
)
XCODE_VERSION=$(xcodebuild -version 2>/dev/null | head -n1 | awk '{ print $2 }')
MAJOR_VERSION=$(echo $XCODE_VERSION | cut -d. -f1)
MINOR_VERSION=$(echo $XCODE_VERSION | cut -d. -f2)
echo "Detected Xcode version: $XCODE_VERSION"
check_required_tool() {
local tool=$1
local install_message=$2
@@ -340,28 +335,21 @@ combine_static_libraries() {
# Platform-specific post-processing for device builds
if [[ "$is_simulator" == "false" ]]; then
if command -v xcrun vtool &>/dev/null; then
if command -v vtool &>/dev/null; then
case "$platform" in
"ios")
echo "Marking binary as a framework binary for iOS..."
xcrun vtool -set-build-version ios ${IOS_MIN_OS_VERSION} ${IOS_MIN_OS_VERSION} -replace \
vtool -set-build-version ios ${IOS_MIN_OS_VERSION} ${IOS_MIN_OS_VERSION} -replace \
-output "${base_dir}/${output_lib}" "${base_dir}/${output_lib}"
;;
"visionos")
echo "Marking binary as a framework binary for visionOS..."
if [[ "$MAJOR_VERSION" -gt 16 ]] || [[ "$MAJOR_VERSION" -eq 16 && "$MINOR_VERSION" -gt 2 ]]; then
echo "Xcode version greater than 16.2, using visionOS."
VISION_OS_BUILD_VERSION="visionos"
else
echo "Xcode version less than or equal to 16.2, using xros."
VISION_OS_BUILD_VERSION="xros"
fi
xcrun vtool -set-build-version ${VISION_OS_BUILD_VERSION} ${VISIONOS_MIN_OS_VERSION} ${VISIONOS_MIN_OS_VERSION} -replace \
vtool -set-build-version xros ${VISIONOS_MIN_OS_VERSION} ${VISIONOS_MIN_OS_VERSION} -replace \
-output "${base_dir}/${output_lib}" "${base_dir}/${output_lib}"
;;
"tvos")
echo "Marking binary as a framework binary for tvOS..."
xcrun vtool -set-build-version tvos ${TVOS_MIN_OS_VERSION} ${TVOS_MIN_OS_VERSION} -replace \
vtool -set-build-version tvos ${TVOS_MIN_OS_VERSION} ${TVOS_MIN_OS_VERSION} -replace \
-output "${base_dir}/${output_lib}" "${base_dir}/${output_lib}"
;;
esac

View File

@@ -4,7 +4,7 @@ A very basic tool for benchmarking the inference performance on your device. The
the transformer on some random audio data and records the execution time. This way we can have an objective comparison
of the performance of the model for various setups.
Benchmark results are tracked in the following Github issue: https://github.com/ggml-org/whisper.cpp/issues/89
Benchmark results are tracked in the following Github issue: https://github.com/ggerganov/whisper.cpp/issues/89
```bash
# run the bench tool on the small.en model using 4 threads
@@ -40,7 +40,7 @@ system_info: n_threads = 4 | AVX2 = 0 | AVX512 = 0 | NEON = 1 | FP16_VA = 1 | WA
If you wish, you can submit these results here:
https://github.com/ggml-org/whisper.cpp/issues/89
https://github.com/ggerganov/whisper.cpp/issues/89
Please include the following information:

View File

@@ -3,7 +3,7 @@
// Speak short text commands to the microphone.
// This program will detect your voice command and convert it to text.
//
// ref: https://github.com/ggml-org/whisper.cpp/issues/171
// ref: https://github.com/ggerganov/whisper.cpp/issues/171
//
#include "common-sdl.h"

View File

@@ -2,7 +2,7 @@
#
# Transcribe audio livestream by feeding ffmpeg output to whisper.cpp at regular intervals
# Idea by @semiformal-net
# ref: https://github.com/ggml-org/whisper.cpp/issues/185
# ref: https://github.com/ggerganov/whisper.cpp/issues/185
#
set -eo pipefail

View File

@@ -1,115 +1,39 @@
import http.server
import socketserver
import os
import sys
from pathlib import Path
import urllib.parse
SCRIPT_DIR = Path(__file__).parent.absolute()
DIRECTORY = os.path.join(SCRIPT_DIR, "../build-em/bin")
DIRECTORY = os.path.abspath(DIRECTORY)
# The context root we want for all applications
CONTEXT_ROOT = "/whisper.cpp"
class CustomHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
def __init__(self, *args, **kwargs):
super().__init__(*args, directory=DIRECTORY, **kwargs)
def do_GET(self):
# Redirect root to the context root
if self.path == '/':
self.send_response(302)
self.send_header('Location', CONTEXT_ROOT + '/')
self.end_headers()
return
# Handle requests under the context root
if self.path.startswith(CONTEXT_ROOT):
# Remove the context root prefix to get the actual path
actual_path = self.path[len(CONTEXT_ROOT):]
if not actual_path:
self.send_response(302)
self.send_header('Location', CONTEXT_ROOT + '/')
self.end_headers()
return
if '.worker.js' in actual_path:
worker_file = os.path.basename(actual_path)
worker_path = os.path.join(DIRECTORY, worker_file)
if os.path.exists(worker_path):
print(f"Found worker file: {worker_path}")
self.path = '/' + worker_file
else:
print(f"Worker file not found: {worker_path}")
elif actual_path == '/':
self.path = '/whisper.wasm/index.html'
elif actual_path.startswith('/bench.wasm/') or actual_path.startswith('/command.wasm/') or actual_path.startswith('/stream.wasm/'):
# Keep the path as is, just remove the context root
self.path = actual_path
# For all other paths under the context root
else:
# Check if this is a request to a file in whisper.wasm
potential_file = os.path.join(DIRECTORY, 'whisper.wasm', actual_path.lstrip('/'))
if os.path.exists(potential_file) and not os.path.isdir(potential_file):
self.path = '/whisper.wasm' + actual_path
else:
# Try to resolve the file from the base directory
potential_file = os.path.join(DIRECTORY, actual_path.lstrip('/'))
if os.path.exists(potential_file):
self.path = actual_path
# For direct requests to worker files (without the context root), as these
# are in the build-em/bin directory
elif '.worker.js' in self.path:
# If requesting a worker file from any subdirectory
if '.worker.js' in self.path:
worker_file = os.path.basename(self.path)
worker_path = os.path.join(DIRECTORY, worker_file)
if os.path.exists(worker_path):
self.path = '/' + worker_file
# Handle coi-serviceworker.js separately
if 'coi-serviceworker.js' in self.path:
worker_file = "coi-serviceworker.js"
worker_path = os.path.join(SCRIPT_DIR, worker_file)
if os.path.exists(worker_path):
self.send_response(200)
self.send_header('Content-type', 'application/javascript')
self.end_headers()
with open(worker_path, 'rb') as file:
self.wfile.write(file.read())
return
else:
print(f"Warning: Could not find {worker_path}")
return super().do_GET()
def end_headers(self):
# Add required headers for SharedArrayBuffer
self.send_header("Cross-Origin-Opener-Policy", "same-origin")
self.send_header("Cross-Origin-Embedder-Policy", "require-corp")
self.send_header("Access-Control-Allow-Origin", "*")
self.send_header("Access-Control-Allow-Origin", "*");
super().end_headers()
PORT = 8000
# Enable address reuse
class CustomServer(socketserver.TCPServer):
allow_reuse_address = True
try:
with CustomServer(("", PORT), CustomHTTPRequestHandler) as httpd:
print(f"Serving directory '{DIRECTORY}' at http://localhost:{PORT}")
print(f"Application context root: http://localhost:{PORT}{CONTEXT_ROOT}/")
try:
httpd.serve_forever()
except KeyboardInterrupt:
print("\nServer stopped.")
# Force complete exit
sys.exit(0)
except OSError as e:
print(f"Error: {e}")
sys.exit(1)
with socketserver.TCPServer(("", PORT), CustomHTTPRequestHandler) as httpd:
print(f"Serving directory '{DIRECTORY}' at http://localhost:{PORT}")
try:
httpd.serve_forever()
except KeyboardInterrupt:
print("\nServer stopped.")

View File

@@ -2,7 +2,7 @@
#
# Transcribe twitch.tv livestream by feeding audio input to whisper.cpp at regular intervals
# Thanks to @keyehzy
# ref: https://github.com/ggml-org/whisper.cpp/issues/209
# ref: https://github.com/ggerganov/whisper.cpp/issues/209
#
# The script currently depends on the third-party tool "streamlink"
# On Mac OS, you can install it via "brew install streamlink"

View File

@@ -5,7 +5,7 @@
# This simple script is called by Neovim to capture audio from the microphone and transcribe it with Whisper.
# In order for this to work, you need to clone the whisper.cpp repo and build the 'stream' tool
#
# git clone https://github.com/ggml-org/whisper.cpp
# git clone https://github.com/ggerganov/whisper.cpp
# cd whisper.cpp
# make stream
#
@@ -31,7 +31,7 @@
model="base.en"
# export the path to the whisper.cpp repo in the WHISPER_CPP_HOME env variable
# https://github.com/ggml-org/whisper.cpp
# https://github.com/ggerganov/whisper.cpp
cd "${WHISPER_CPP_HOME}"
if [ ! -f ./stream ] ; then

View File

@@ -30,7 +30,7 @@ Link: https://ggerganov.github.io/whisper.cpp/
```bash (v3.1.2)
# build using Emscripten
git clone https://github.com/ggml-org/whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
mkdir build-em && cd build-em
emcmake cmake ..

View File

@@ -65,14 +65,13 @@ EMSCRIPTEN_BINDINGS(whisper) {
}
struct whisper_full_params params = whisper_full_default_params(whisper_sampling_strategy::WHISPER_SAMPLING_GREEDY);
bool is_multilingual = whisper_is_multilingual(g_contexts[index]);
params.print_realtime = true;
params.print_progress = false;
params.print_timestamps = true;
params.print_special = false;
params.translate = translate;
params.language = is_multilingual ? strdup(lang.c_str()) : "en";
params.language = whisper_is_multilingual(g_contexts[index]) ? lang.c_str() : "en";
params.n_threads = std::min(nthreads, std::min(16, mpow2(std::thread::hardware_concurrency())));
params.offset_ms = 0;
@@ -103,13 +102,10 @@ EMSCRIPTEN_BINDINGS(whisper) {
// run the worker
{
g_worker = std::thread([index, params, pcmf32 = std::move(pcmf32), is_multilingual]() {
g_worker = std::thread([index, params, pcmf32 = std::move(pcmf32)]() {
whisper_reset_timings(g_contexts[index]);
whisper_full(g_contexts[index], params, pcmf32.data(), pcmf32.size());
whisper_print_timings(g_contexts[index]);
if (is_multilingual) {
free((void*)params.language);
}
});
}

View File

@@ -25,12 +25,12 @@
# SOFTWARE.
# Small shell script to automatically download and transcribe live stream VODs.
# This uses YT-DLP, ffmpeg and the CPP version of Whisper: https://github.com/ggml-org/whisper.cpp
# This uses YT-DLP, ffmpeg and the CPP version of Whisper: https://github.com/ggerganov/whisper.cpp
# Use `./examples/yt-wsp.sh help` to print help info.
#
# Sample usage:
#
# git clone https://github.com/ggml-org/whisper.cpp
# git clone https://github.com/ggerganov/whisper.cpp
# cd whisper.cpp
# make
# ./examples/yt-wsp.sh https://www.youtube.com/watch?v=1234567890
@@ -44,7 +44,7 @@ SCRIPT_DIR="${SCRIPT_PATH%/*}"
################################################################################
# Documentation on downloading models can be found in the whisper.cpp repo:
# https://github.com/ggml-org/whisper.cpp/#usage
# https://github.com/ggerganov/whisper.cpp/#usage
#
# note: unless a multilingual model is specified, WHISPER_LANG will be ignored
# and the video will be transcribed as if the audio were in the English language
@@ -103,10 +103,10 @@ check_requirements() {
fi;
if ! command -v "${WHISPER_EXECUTABLE}" &>/dev/null; then
echo "The C++ implementation of Whisper is required: https://github.com/ggml-org/whisper.cpp"
echo "The C++ implementation of Whisper is required: https://github.com/ggerganov/whisper.cpp"
echo "Sample usage:";
echo "";
echo " git clone https://github.com/ggml-org/whisper.cpp";
echo " git clone https://github.com/ggerganov/whisper.cpp";
echo " cd whisper.cpp";
echo " make";
echo " ./examples/yt-wsp.sh https://www.youtube.com/watch?v=1234567890";

View File

@@ -25,6 +25,7 @@ You can now use it like this:
`ggml` models are available from the following locations:
- https://huggingface.co/ggerganov/whisper.cpp/tree/main
- https://ggml.ggerganov.com
### 3. Convert with [convert-pt-to-ggml.py](convert-pt-to-ggml.py)
@@ -77,7 +78,7 @@ OpenAI format. To read the HF models you can use the [convert-h5-to-ggml.py](con
```bash
git clone https://github.com/openai/whisper
git clone https://github.com/ggml-org/whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp
# clone HF fine-tuned model (this is just an example)
git clone https://huggingface.co/openai/whisper-medium
@@ -95,7 +96,7 @@ Currently, the chunk-based transcription strategy is not implemented, so there c
```bash
# clone OpenAI whisper and whisper.cpp
git clone https://github.com/openai/whisper
git clone https://github.com/ggml-org/whisper.cpp
git clone https://github.com/ggerganov/whisper.cpp
# get the models
cd whisper.cpp/models

View File

@@ -3,7 +3,7 @@
# Usage:
#
# git clone https://github.com/openai/whisper
# git clone https://github.com/ggml-org/whisper.cpp
# git clone https://github.com/ggerganov/whisper.cpp
# git clone https://huggingface.co/openai/whisper-medium
#
# python3 ./whisper.cpp/models/convert-h5-to-ggml.py ./whisper-medium/ ./whisper .
@@ -12,7 +12,7 @@
#
# For more info:
#
# https://github.com/ggml-org/whisper.cpp/issues/157
# https://github.com/ggerganov/whisper.cpp/issues/157
#
import io

View File

@@ -1,4 +1,4 @@
## M1 Pro (old 22c96b4)
## M1 Pro
make -j && ./scripts/bench-all.sh 8
@@ -67,184 +67,202 @@ make -j && ./scripts/bench-all.sh 8
Running memcpy benchmark
memcpy: 48.01 GB/s (heat-up)
memcpy: 56.00 GB/s ( 1 thread)
memcpy: 56.20 GB/s ( 1 thread)
memcpy: 102.69 GB/s ( 2 thread)
memcpy: 140.32 GB/s ( 3 thread)
memcpy: 179.04 GB/s ( 4 thread)
memcpy: 159.61 GB/s ( 5 thread)
memcpy: 159.02 GB/s ( 6 thread)
memcpy: 180.29 GB/s ( 7 thread)
memcpy: 198.10 GB/s ( 8 thread)
sum: -5119999345.000000
memcpy: 46.58 GB/s (heat-up)
memcpy: 54.16 GB/s ( 1 thread)
memcpy: 54.23 GB/s ( 1 thread)
memcpy: 99.63 GB/s ( 2 thread)
memcpy: 140.59 GB/s ( 3 thread)
memcpy: 176.52 GB/s ( 4 thread)
memcpy: 158.90 GB/s ( 5 thread)
memcpy: 163.00 GB/s ( 6 thread)
memcpy: 189.69 GB/s ( 7 thread)
memcpy: 197.15 GB/s ( 8 thread)
sum: -5120002007.000000
make -j && ./scripts/bench-all.sh 1
Running ggml_mul_mat benchmark with 1 threads
64 x 64: Q4_0 37.7 GFLOPS (128 runs) | Q4_1 36.0 GFLOPS (128 runs)
64 x 64: Q5_0 20.1 GFLOPS (128 runs) | Q5_1 19.8 GFLOPS (128 runs) | Q8_0 39.5 GFLOPS (128 runs)
64 x 64: F16 29.9 GFLOPS (128 runs) | F32 22.6 GFLOPS (128 runs)
128 x 128: Q4_0 71.0 GFLOPS (128 runs) | Q4_1 62.2 GFLOPS (128 runs)
128 x 128: Q5_0 33.4 GFLOPS (128 runs) | Q5_1 31.6 GFLOPS (128 runs) | Q8_0 79.8 GFLOPS (128 runs)
128 x 128: F16 52.4 GFLOPS (128 runs) | F32 32.7 GFLOPS (128 runs)
256 x 256: Q4_0 88.6 GFLOPS (128 runs) | Q4_1 77.2 GFLOPS (128 runs)
256 x 256: Q5_0 40.3 GFLOPS (128 runs) | Q5_1 36.8 GFLOPS (128 runs) | Q8_0 102.5 GFLOPS (128 runs)
256 x 256: F16 64.6 GFLOPS (128 runs) | F32 36.4 GFLOPS (128 runs)
512 x 512: Q4_0 94.7 GFLOPS (128 runs) | Q4_1 83.6 GFLOPS (128 runs)
512 x 512: Q5_0 45.9 GFLOPS (128 runs) | Q5_1 41.3 GFLOPS (128 runs) | Q8_0 112.8 GFLOPS (128 runs)
512 x 512: F16 72.3 GFLOPS (128 runs) | F32 37.7 GFLOPS (128 runs)
1024 x 1024: Q4_0 98.9 GFLOPS ( 47 runs) | Q4_1 88.2 GFLOPS ( 42 runs)
1024 x 1024: Q5_0 49.0 GFLOPS ( 23 runs) | Q5_1 43.9 GFLOPS ( 21 runs) | Q8_0 121.0 GFLOPS ( 57 runs)
1024 x 1024: F16 72.6 GFLOPS ( 34 runs) | F32 36.0 GFLOPS ( 17 runs)
2048 x 2048: Q4_0 101.3 GFLOPS ( 6 runs) | Q4_1 90.0 GFLOPS ( 6 runs)
2048 x 2048: Q5_0 50.8 GFLOPS ( 3 runs) | Q5_1 45.3 GFLOPS ( 3 runs) | Q8_0 124.1 GFLOPS ( 8 runs)
2048 x 2048: F16 70.7 GFLOPS ( 5 runs) | F32 30.4 GFLOPS ( 3 runs)
4096 x 4096: Q4_0 101.7 GFLOPS ( 3 runs) | Q4_1 90.3 GFLOPS ( 3 runs)
4096 x 4096: Q5_0 52.2 GFLOPS ( 3 runs) | Q5_1 45.7 GFLOPS ( 3 runs) | Q8_0 123.0 GFLOPS ( 3 runs)
4096 x 4096: F16 60.3 GFLOPS ( 3 runs) | F32 29.8 GFLOPS ( 3 runs)
64 x 64: Q4_0 245.8 GFLOPS (128 runs) | Q4_1 168.6 GFLOPS (128 runs)
64 x 64: Q5_0 115.7 GFLOPS (128 runs) | Q5_1 125.9 GFLOPS (128 runs) | Q8_0 215.8 GFLOPS (128 runs)
64 x 64: F16 139.5 GFLOPS (128 runs) | F32 337.2 GFLOPS (128 runs)
128 x 128: Q4_0 494.8 GFLOPS (128 runs) | Q4_1 350.4 GFLOPS (128 runs)
128 x 128: Q5_0 257.1 GFLOPS (128 runs) | Q5_1 261.4 GFLOPS (128 runs) | Q8_0 509.4 GFLOPS (128 runs)
128 x 128: F16 302.3 GFLOPS (128 runs) | F32 672.8 GFLOPS (128 runs)
256 x 256: Q4_0 795.7 GFLOPS (128 runs) | Q4_1 663.7 GFLOPS (128 runs)
256 x 256: Q5_0 737.8 GFLOPS (128 runs) | Q5_1 757.6 GFLOPS (128 runs) | Q8_0 827.7 GFLOPS (128 runs)
256 x 256: F16 872.6 GFLOPS (128 runs) | F32 956.3 GFLOPS (128 runs)
512 x 512: Q4_0 1188.0 GFLOPS (128 runs) | Q4_1 1085.0 GFLOPS (128 runs)
512 x 512: Q5_0 1421.1 GFLOPS (128 runs) | Q5_1 1454.9 GFLOPS (128 runs) | Q8_0 1191.4 GFLOPS (128 runs)
512 x 512: F16 1577.4 GFLOPS (128 runs) | F32 1982.0 GFLOPS (128 runs)
1024 x 1024: Q4_0 2342.6 GFLOPS (128 runs) | Q4_1 1955.8 GFLOPS (128 runs)
1024 x 1024: Q5_0 2306.7 GFLOPS (128 runs) | Q5_1 2217.0 GFLOPS (128 runs) | Q8_0 2230.7 GFLOPS (128 runs)
1024 x 1024: F16 2593.8 GFLOPS (128 runs) | F32 3269.0 GFLOPS (128 runs)
2048 x 2048: Q4_0 3735.7 GFLOPS (128 runs) | Q4_1 3205.3 GFLOPS (128 runs)
2048 x 2048: Q5_0 3584.5 GFLOPS (128 runs) | Q5_1 3621.7 GFLOPS (128 runs) | Q8_0 3622.3 GFLOPS (128 runs)
2048 x 2048: F16 3763.6 GFLOPS (128 runs) | F32 4153.3 GFLOPS (128 runs)
4096 x 4096: Q4_0 3891.1 GFLOPS ( 29 runs) | Q4_1 3554.0 GFLOPS ( 26 runs)
4096 x 4096: Q5_0 3753.1 GFLOPS ( 28 runs) | Q5_1 3750.1 GFLOPS ( 28 runs) | Q8_0 3768.5 GFLOPS ( 28 runs)
4096 x 4096: F16 3864.2 GFLOPS ( 29 runs) | F32 3970.5 GFLOPS ( 29 runs)
make -j && ./scripts/bench-all.sh 1 1 0
| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 ULTRA | METAL | tiny | 1 | 0 | 8.74 | 1.20 | 0.36 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 0 | 10.30 | 1.15 | 0.38 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 0 | 10.71 | 1.13 | 0.38 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q8_0 | 1 | 0 | 9.97 | 1.12 | 0.37 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | base | 1 | 0 | 16.77 | 1.71 | 0.44 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q5_0 | 1 | 0 | 16.92 | 1.63 | 0.44 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q5_1 | 1 | 0 | 16.84 | 1.63 | 0.44 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q8_0 | 1 | 0 | 16.12 | 1.63 | 0.44 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | small | 1 | 0 | 45.29 | 3.44 | 0.92 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | small-q5_0 | 1 | 0 | 50.43 | 3.34 | 0.94 | 0.06 | ad4e350 |
| M2 ULTRA | METAL | small-q5_1 | 1 | 0 | 50.49 | 3.35 | 0.93 | 0.06 | ad4e350 |
| M2 ULTRA | METAL | small-q8_0 | 1 | 0 | 47.37 | 3.20 | 0.91 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | medium | 1 | 0 | 122.81 | 7.39 | 1.99 | 0.12 | ad4e350 |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 0 | 140.62 | 6.73 | 2.03 | 0.14 | ad4e350 |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 0 | 140.44 | 6.74 | 2.04 | 0.14 | ad4e350 |
| M2 ULTRA | METAL | medium-q8_0 | 1 | 0 | 131.05 | 6.54 | 1.95 | 0.13 | ad4e350 |
| M2 ULTRA | METAL | medium-dis | 1 | 0 | 110.95 | 0.99 | 0.24 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | large-v2 | 1 | 0 | 222.19 | 10.93 | 3.01 | 0.21 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 0 | 258.47 | 9.75 | 3.01 | 0.25 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 0 | 258.40 | 9.85 | 3.01 | 0.24 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q8_0 | 1 | 0 | 236.68 | 9.61 | 2.85 | 0.23 | ad4e350 |
| M2 ULTRA | METAL | large-v2-dis | 1 | 0 | 199.28 | 1.12 | 0.27 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo | 1 | 0 | 201.49 | 1.76 | 0.45 | 0.03 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo-q5_0 | 1 | 0 | 233.70 | 1.55 | 0.46 | 0.04 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo-q8_0 | 1 | 0 | 214.20 | 1.51 | 0.44 | 0.04 | ad4e350 |
| M2 ULTRA | METAL | tiny | 1 | 0 | 12.32 | 1.35 | 0.49 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 0 | 11.65 | 1.30 | 0.51 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 0 | 12.08 | 1.30 | 0.51 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | base | 1 | 0 | 17.58 | 1.90 | 0.76 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | base-q5_0 | 1 | 0 | 18.89 | 1.86 | 0.79 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | base-q5_1 | 1 | 0 | 20.69 | 1.88 | 0.79 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | small | 1 | 0 | 49.32 | 3.85 | 1.71 | 0.05 | 22c96b4 |
| M2 ULTRA | METAL | small-q5_0 | 1 | 0 | 54.91 | 3.81 | 1.82 | 0.06 | 22c96b4 |
| M2 ULTRA | METAL | small-q5_1 | 1 | 0 | 54.92 | 3.81 | 1.79 | 0.06 | 22c96b4 |
| M2 ULTRA | METAL | medium | 1 | 0 | 134.34 | 8.04 | 3.82 | 0.13 | 22c96b4 |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 0 | 151.68 | 7.59 | 4.07 | 0.14 | 22c96b4 |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 0 | 151.58 | 7.67 | 4.07 | 0.14 | 22c96b4 |
| M2 ULTRA | METAL | medium-dis | 1 | 0 | 120.82 | 1.07 | 0.41 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | large-v2 | 1 | 0 | 235.63 | 12.27 | 5.85 | 0.22 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 0 | 273.38 | 11.17 | 6.40 | 0.26 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 0 | 272.44 | 11.32 | 6.29 | 0.26 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-dis | 1 | 0 | 212.51 | 1.20 | 0.47 | 0.02 | 22c96b4 |
make -j && ./scripts/bench-all.sh 1 1 1
| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M2 ULTRA | METAL | tiny | 1 | 1 | 7.82 | 1.31 | 0.35 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 1 | 8.32 | 1.28 | 0.37 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 1 | 8.21 | 1.28 | 0.37 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | tiny-q8_0 | 1 | 1 | 7.97 | 1.23 | 0.36 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | base | 1 | 1 | 13.96 | 1.80 | 0.42 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q5_0 | 1 | 1 | 15.19 | 1.75 | 0.42 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q5_1 | 1 | 1 | 15.09 | 1.75 | 0.42 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | base-q8_0 | 1 | 1 | 14.45 | 1.70 | 0.41 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | small | 1 | 1 | 40.08 | 3.54 | 0.86 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | small-q5_0 | 1 | 1 | 45.07 | 3.51 | 0.88 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | small-q5_1 | 1 | 1 | 45.05 | 3.52 | 0.88 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | small-q8_0 | 1 | 1 | 42.04 | 3.34 | 0.85 | 0.05 | ad4e350 |
| M2 ULTRA | METAL | medium | 1 | 1 | 107.20 | 7.28 | 1.79 | 0.11 | ad4e350 |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 1 | 125.02 | 6.67 | 1.83 | 0.12 | ad4e350 |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 1 | 124.83 | 6.70 | 1.84 | 0.12 | ad4e350 |
| M2 ULTRA | METAL | medium-q8_0 | 1 | 1 | 114.56 | 6.53 | 1.79 | 0.11 | ad4e350 |
| M2 ULTRA | METAL | medium-dis | 1 | 1 | 95.96 | 1.01 | 0.23 | 0.01 | ad4e350 |
| M2 ULTRA | METAL | large-v2 | 1 | 1 | 194.29 | 10.57 | 2.67 | 0.20 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 1 | 230.74 | 9.57 | 2.73 | 0.23 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 1 | 229.97 | 9.69 | 2.74 | 0.23 | ad4e350 |
| M2 ULTRA | METAL | large-v2-q8_0 | 1 | 1 | 208.11 | 9.37 | 2.60 | 0.21 | ad4e350 |
| M2 ULTRA | METAL | large-v2-dis | 1 | 1 | 172.72 | 1.12 | 0.26 | 0.02 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo | 1 | 1 | 174.46 | 1.74 | 0.42 | 0.03 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo-q5_0 | 1 | 1 | 205.78 | 1.54 | 0.42 | 0.04 | ad4e350 |
| M2 ULTRA | METAL | large-v3-turbo-q8_0 | 1 | 1 | 186.33 | 1.50 | 0.40 | 0.03 | ad4e350 |
| M2 ULTRA | METAL | tiny | 1 | 1 | 9.07 | 1.33 | 0.45 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | tiny-q5_0 | 1 | 1 | 9.74 | 1.33 | 0.47 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | tiny-q5_1 | 1 | 1 | 8.93 | 1.31 | 0.46 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | base | 1 | 1 | 15.75 | 1.87 | 0.71 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | base-q5_0 | 1 | 1 | 17.04 | 1.83 | 0.74 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | base-q5_1 | 1 | 1 | 17.17 | 1.83 | 0.74 | 0.02 | 22c96b4 |
| M2 ULTRA | METAL | small | 1 | 1 | 42.33 | 3.64 | 1.60 | 0.05 | 22c96b4 |
| M2 ULTRA | METAL | small-q5_0 | 1 | 1 | 47.61 | 3.63 | 1.70 | 0.05 | 22c96b4 |
| M2 ULTRA | METAL | small-q5_1 | 1 | 1 | 47.70 | 3.66 | 1.68 | 0.05 | 22c96b4 |
| M2 ULTRA | METAL | medium | 1 | 1 | 114.42 | 7.53 | 3.55 | 0.11 | 22c96b4 |
| M2 ULTRA | METAL | medium-q5_0 | 1 | 1 | 132.63 | 7.02 | 3.77 | 0.13 | 22c96b4 |
| M2 ULTRA | METAL | medium-q5_1 | 1 | 1 | 132.28 | 7.10 | 3.76 | 0.13 | 22c96b4 |
| M2 ULTRA | METAL | medium-dis | 1 | 1 | 102.34 | 1.01 | 0.42 | 0.01 | 22c96b4 |
| M2 ULTRA | METAL | large-v2 | 1 | 1 | 203.01 | 11.03 | 5.45 | 0.20 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-q5_0 | 1 | 1 | 240.05 | 10.18 | 5.98 | 0.23 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-q5_1 | 1 | 1 | 239.22 | 10.23 | 5.87 | 0.23 | 22c96b4 |
| M2 ULTRA | METAL | large-v2-dis | 1 | 1 | 181.14 | 1.14 | 0.48 | 0.02 | 22c96b4 |
## M4 Max
make -j && ./scripts/bench-all.sh 8
## Ryzen 9 5950X + RTX 2060
make -j && ./scripts/bench-all.sh 8 0 0
Running memcpy benchmark
memcpy: 57.23 GB/s (heat-up)
memcpy: 68.85 GB/s ( 1 thread)
memcpy: 70.00 GB/s ( 1 thread)
memcpy: 104.83 GB/s ( 2 thread)
memcpy: 124.54 GB/s ( 3 thread)
memcpy: 144.30 GB/s ( 4 thread)
memcpy: 141.24 GB/s ( 5 thread)
memcpy: 147.03 GB/s ( 6 thread)
memcpy: 147.18 GB/s ( 7 thread)
memcpy: 149.83 GB/s ( 8 thread)
sum: -5120001475.000000
memcpy: 12.36 GB/s (heat-up)
memcpy: 12.33 GB/s ( 1 thread)
memcpy: 12.38 GB/s ( 1 thread)
memcpy: 14.48 GB/s ( 2 thread)
memcpy: 15.00 GB/s ( 3 thread)
memcpy: 14.77 GB/s ( 4 thread)
memcpy: 14.60 GB/s ( 5 thread)
memcpy: 14.57 GB/s ( 6 thread)
memcpy: 14.34 GB/s ( 7 thread)
memcpy: 14.40 GB/s ( 8 thread)
sum: -5119998076.000000
Running ggml_mul_mat benchmark with 8 threads
64 x 64: Q4_0 3.1 GFLOPS (128 runs) | Q4_1 3.1 GFLOPS (128 runs)
64 x 64: Q5_0 3.0 GFLOPS (128 runs) | Q5_1 2.9 GFLOPS (128 runs) | Q8_0 3.1 GFLOPS (128 runs)
64 x 64: F16 3.0 GFLOPS (128 runs) | F32 3.0 GFLOPS (128 runs)
128 x 128: Q4_0 21.1 GFLOPS (128 runs) | Q4_1 20.3 GFLOPS (128 runs)
128 x 128: Q5_0 20.6 GFLOPS (128 runs) | Q5_1 20.4 GFLOPS (128 runs) | Q8_0 22.1 GFLOPS (128 runs)
128 x 128: F16 21.7 GFLOPS (128 runs) | F32 21.7 GFLOPS (128 runs)
256 x 256: Q4_0 105.7 GFLOPS (128 runs) | Q4_1 94.4 GFLOPS (128 runs)
256 x 256: Q5_0 94.8 GFLOPS (128 runs) | Q5_1 87.5 GFLOPS (128 runs) | Q8_0 107.2 GFLOPS (128 runs)
256 x 256: F16 95.1 GFLOPS (128 runs) | F32 94.3 GFLOPS (128 runs)
512 x 512: Q4_0 214.7 GFLOPS (128 runs) | Q4_1 189.8 GFLOPS (128 runs)
512 x 512: Q5_0 187.7 GFLOPS (128 runs) | Q5_1 176.2 GFLOPS (128 runs) | Q8_0 252.2 GFLOPS (128 runs)
512 x 512: F16 220.8 GFLOPS (128 runs) | F32 218.3 GFLOPS (128 runs)
1024 x 1024: Q4_0 333.7 GFLOPS (128 runs) | Q4_1 305.8 GFLOPS (128 runs)
1024 x 1024: Q5_0 283.2 GFLOPS (128 runs) | Q5_1 268.2 GFLOPS (125 runs) | Q8_0 394.1 GFLOPS (128 runs)
1024 x 1024: F16 355.0 GFLOPS (128 runs) | F32 313.0 GFLOPS (128 runs)
2048 x 2048: Q4_0 395.0 GFLOPS ( 23 runs) | Q4_1 380.6 GFLOPS ( 23 runs)
2048 x 2048: Q5_0 336.6 GFLOPS ( 20 runs) | Q5_1 318.4 GFLOPS ( 19 runs) | Q8_0 482.6 GFLOPS ( 29 runs)
2048 x 2048: F16 424.5 GFLOPS ( 25 runs) | F32 337.7 GFLOPS ( 20 runs)
4096 x 4096: Q4_0 412.8 GFLOPS ( 4 runs) | Q4_1 405.1 GFLOPS ( 3 runs)
4096 x 4096: Q5_0 346.0 GFLOPS ( 3 runs) | Q5_1 334.6 GFLOPS ( 3 runs) | Q8_0 502.6 GFLOPS ( 4 runs)
4096 x 4096: F16 412.5 GFLOPS ( 4 runs) | F32 274.0 GFLOPS ( 3 runs)
| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ryzen 9 5950X | AVX2 | tiny | 8 | 0 | 195.29 | 1.57 | 0.51 | 0.26 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | tiny-q5_0 | 8 | 0 | 213.33 | 1.10 | 0.50 | 0.30 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | tiny-q5_1 | 8 | 0 | 219.38 | 1.18 | 0.53 | 0.32 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | base | 8 | 0 | 424.85 | 3.71 | 1.03 | 0.46 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | base-q5_0 | 8 | 0 | 473.61 | 1.81 | 0.82 | 0.52 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | base-q5_1 | 8 | 0 | 484.14 | 1.92 | 0.85 | 0.56 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | small | 8 | 0 | 1458.32 | 12.66 | 3.09 | 1.26 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | small-q5_0 | 8 | 0 | 1673.22 | 6.42 | 2.18 | 1.45 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | small-q5_1 | 8 | 0 | 1724.78 | 6.72 | 2.32 | 1.52 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | medium | 8 | 0 | 4333.87 | 36.80 | 8.56 | 3.37 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | medium-q5_0 | 8 | 0 | 5194.09 | 19.21 | 5.71 | 3.97 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | medium-q5_1 | 8 | 0 | 5450.39 | 20.01 | 5.99 | 4.17 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | medium-dis | 8 | 0 | 3995.19 | 5.08 | 1.21 | 0.55 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | large-v2 | 8 | 0 | 8056.16 | 69.74 | 16.11 | 6.13 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | large-v2-q5_0 | 8 | 0 | 9799.58 | 35.16 | 10.49 | 7.28 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | large-v2-q5_1 | 8 | 0 | ms | 36.74 | 11.02 | 7.65 | 22c96b4 |
| Ryzen 9 5950X | AVX2 | large-v2-dis | 8 | 0 | 7490.03 | 7.40 | 1.70 | 0.72 | 22c96b4 |
make -j && ./scripts/bench-all.sh 1
WHISPER_CUDA=1 make -j && ./scripts/bench-all.sh 8 1 0
Running ggml_mul_mat benchmark with 1 threads
64 x 64: Q4_0 49.6 GFLOPS (128 runs) | Q4_1 46.8 GFLOPS (128 runs)
64 x 64: Q5_0 28.1 GFLOPS (128 runs) | Q5_1 26.8 GFLOPS (128 runs) | Q8_0 52.3 GFLOPS (128 runs)
64 x 64: F16 38.1 GFLOPS (128 runs) | F32 26.0 GFLOPS (128 runs)
128 x 128: Q4_0 87.6 GFLOPS (128 runs) | Q4_1 79.9 GFLOPS (128 runs)
128 x 128: Q5_0 44.7 GFLOPS (128 runs) | Q5_1 41.6 GFLOPS (128 runs) | Q8_0 98.9 GFLOPS (128 runs)
128 x 128: F16 64.1 GFLOPS (128 runs) | F32 35.4 GFLOPS (128 runs)
256 x 256: Q4_0 104.2 GFLOPS (128 runs) | Q4_1 92.3 GFLOPS (128 runs)
256 x 256: Q5_0 57.3 GFLOPS (128 runs) | Q5_1 51.5 GFLOPS (128 runs) | Q8_0 127.7 GFLOPS (128 runs)
256 x 256: F16 71.4 GFLOPS (128 runs) | F32 40.6 GFLOPS (128 runs)
512 x 512: Q4_0 109.5 GFLOPS (128 runs) | Q4_1 98.0 GFLOPS (128 runs)
512 x 512: Q5_0 62.4 GFLOPS (128 runs) | Q5_1 54.6 GFLOPS (128 runs) | Q8_0 135.0 GFLOPS (128 runs)
512 x 512: F16 82.6 GFLOPS (128 runs) | F32 44.6 GFLOPS (128 runs)
1024 x 1024: Q4_0 112.1 GFLOPS ( 53 runs) | Q4_1 100.9 GFLOPS ( 47 runs)
1024 x 1024: Q5_0 65.4 GFLOPS ( 31 runs) | Q5_1 56.7 GFLOPS ( 27 runs) | Q8_0 140.9 GFLOPS ( 66 runs)
1024 x 1024: F16 88.0 GFLOPS ( 41 runs) | F32 43.4 GFLOPS ( 21 runs)
2048 x 2048: Q4_0 113.4 GFLOPS ( 7 runs) | Q4_1 102.0 GFLOPS ( 6 runs)
2048 x 2048: Q5_0 67.1 GFLOPS ( 4 runs) | Q5_1 57.7 GFLOPS ( 4 runs) | Q8_0 142.7 GFLOPS ( 9 runs)
2048 x 2048: F16 84.6 GFLOPS ( 5 runs) | F32 37.5 GFLOPS ( 3 runs)
4096 x 4096: Q4_0 113.8 GFLOPS ( 3 runs) | Q4_1 102.0 GFLOPS ( 3 runs)
4096 x 4096: Q5_0 67.7 GFLOPS ( 3 runs) | Q5_1 58.0 GFLOPS ( 3 runs) | Q8_0 142.9 GFLOPS ( 3 runs)
4096 x 4096: F16 73.7 GFLOPS ( 3 runs) | F32 36.1 GFLOPS ( 3 runs)
| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 2060 | AVX2 CUDA | tiny | 8 | 0 | 12.54 | 0.93 | 0.29 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | tiny-q5_0 | 8 | 0 | 12.73 | 0.98 | 0.24 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | tiny-q5_1 | 8 | 0 | 12.72 | 0.99 | 0.24 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base | 8 | 0 | 24.14 | 1.28 | 0.41 | 0.03 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base-q5_0 | 8 | 0 | 24.58 | 1.38 | 0.35 | 0.03 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base-q5_1 | 8 | 0 | 24.58 | 1.37 | 0.35 | 0.03 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small | 8 | 0 | 74.70 | 2.91 | 0.84 | 0.07 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small-q5_0 | 8 | 0 | 76.12 | 2.84 | 0.77 | 0.08 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small-q5_1 | 8 | 0 | 76.14 | 2.84 | 0.76 | 0.08 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium | 8 | 0 | 200.69 | 6.46 | 1.83 | 0.17 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-q5_0 | 8 | 0 | 204.80 | 5.90 | 1.65 | 0.19 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-q5_1 | 8 | 0 | 205.61 | 5.85 | 1.61 | 0.19 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-dis | 8 | 0 | 186.17 | 0.86 | 0.24 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2 | 8 | 0 | 347.22 | 10.36 | 2.82 | 0.29 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 8 | 0 | 357.06 | 8.81 | 2.58 | 0.34 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 8 | 0 | 356.97 | 8.62 | 2.49 | 0.33 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-dis | 8 | 0 | 318.05 | 1.03 | 0.34 | 0.04 | 22c96b4 |
make -j && ./scripts/bench-all.sh 1 1 0
WHISPER_CUDA=1 make -j && ./scripts/bench-all.sh 8 1 1
| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M4 Max | METAL | tiny | 1 | 0 | 13.12 | 0.87 | 0.29 | 0.01 | ad4e3509 |
| M4 Max | METAL | tiny-q8_0 | 1 | 0 | 15.90 | 0.88 | 0.31 | 0.01 | ad4e3509 |
| M4 Max | METAL | base | 1 | 0 | 23.10 | 1.42 | 0.34 | 0.02 | ad4e3509 |
| M4 Max | METAL | base-q8_0 | 1 | 0 | 27.25 | 1.31 | 0.34 | 0.02 | ad4e3509 |
| M4 Max | METAL | small | 1 | 0 | 71.76 | 3.02 | 0.70 | 0.06 | ad4e3509 |
| M4 Max | METAL | small-q8_0 | 1 | 0 | 73.88 | 2.60 | 0.71 | 0.06 | ad4e3509 |
| M4 Max | METAL | medium | 1 | 0 | 208.22 | 6.94 | 1.55 | 0.16 | ad4e3509 |
| M4 Max | METAL | medium-q8_0 | 1 | 0 | 214.65 | 5.90 | 1.57 | 0.17 | ad4e3509 |
| M4 Max | METAL | large-v2 | 1 | 0 | 381.72 | 11.28 | 2.51 | 0.29 | ad4e3509 |
| M4 Max | METAL | large-v2-q8_0 | 1 | 0 | 394.97 | 8.90 | 2.45 | 0.30 | ad4e3509 |
| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTX 2060 | AVX2 CUDA | tiny | 8 | 1 | 7.21 | 0.76 | 0.29 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | tiny-q5_0 | 8 | 1 | 7.42 | 0.82 | 0.18 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | tiny-q5_1 | 8 | 1 | 7.38 | 0.82 | 0.18 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base | 8 | 1 | 13.49 | 1.04 | 0.36 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base-q5_0 | 8 | 1 | 13.94 | 1.13 | 0.26 | 0.03 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | base-q5_1 | 8 | 1 | 13.94 | 1.14 | 0.26 | 0.03 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small | 8 | 1 | 42.81 | 2.33 | 0.69 | 0.05 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small-q5_0 | 8 | 1 | 44.43 | 2.25 | 0.59 | 0.06 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | small-q5_1 | 8 | 1 | 44.11 | 2.24 | 0.58 | 0.06 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium | 8 | 1 | 115.47 | 5.17 | 1.45 | 0.11 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-q5_0 | 8 | 1 | 120.37 | 4.63 | 1.25 | 0.13 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-q5_1 | 8 | 1 | 120.28 | 4.55 | 1.21 | 0.13 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | medium-dis | 8 | 1 | 101.69 | 0.75 | 0.20 | 0.02 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2 | 8 | 1 | 205.67 | 8.49 | 2.19 | 0.18 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-q5_0 | 8 | 1 | 214.07 | 6.88 | 1.94 | 0.22 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-q5_1 | 8 | 1 | 213.98 | 6.70 | 1.86 | 0.22 | 22c96b4 |
| RTX 2060 | AVX2 CUDA | large-v2-dis | 8 | 1 | 176.71 | 0.91 | 0.31 | 0.03 | 22c96b4 |
make -j && ./scripts/bench-all.sh 1 1 1
| CPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M4 Max | METAL | tiny | 1 | 1 | 15.22 | 0.89 | 0.26 | 0.01 | ad4e3509 |
| M4 Max | METAL | tiny-q8_0 | 1 | 1 | 14.70 | 0.86 | 0.26 | 0.01 | ad4e3509 |
| M4 Max | METAL | base | 1 | 1 | 25.33 | 1.36 | 0.30 | 0.02 | ad4e3509 |
| M4 Max | METAL | base-q8_0 | 1 | 1 | 21.27 | 1.31 | 0.30 | 0.02 | ad4e3509 |
| M4 Max | METAL | small | 1 | 1 | 58.43 | 2.78 | 0.60 | 0.05 | ad4e3509 |
| M4 Max | METAL | small-q8_0 | 1 | 1 | 60.26 | 2.39 | 0.60 | 0.05 | ad4e3509 |
| M4 Max | METAL | medium | 1 | 1 | 169.73 | 6.03 | 1.31 | 0.14 | ad4e3509 |
| M4 Max | METAL | medium-q8_0 | 1 | 1 | 176.61 | 4.99 | 1.31 | 0.14 | ad4e3509 |
| M4 Max | METAL | large-v2 | 1 | 1 | 316.18 | 9.60 | 2.08 | 0.24 | ad4e3509 |
| M4 Max | METAL | large-v2-q8_0 | 1 | 1 | 329.59 | 7.55 | 2.08 | 0.25 | ad4e3509 |
# V100
@@ -253,33 +271,28 @@ WHISPER_CUDA=1 make -j && ./scripts/bench-all.sh 8 1 0
| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| V100 | AVX2 CUDA | tiny | 8 | 0 | 6.15 | 1.02 | 0.30 | 0.01 | ad4e3509 |
| V100 | AVX2 CUDA | tiny-q5_1 | 8 | 0 | 5.92 | 0.96 | 0.25 | 0.01 | ad4e3509 |
| V100 | AVX2 CUDA | base | 8 | 0 | 10.60 | 1.43 | 0.43 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | base-q5_1 | 8 | 0 | 10.80 | 1.37 | 0.36 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | small | 8 | 0 | 31.83 | 2.82 | 0.87 | 0.04 | ad4e3509 |
| V100 | AVX2 CUDA | small-q5_1 | 8 | 0 | 31.88 | 2.68 | 0.72 | 0.04 | ad4e3509 |
| V100 | AVX2 CUDA | medium | 8 | 0 | 81.30 | 6.02 | 1.81 | 0.09 | ad4e3509 |
| V100 | AVX2 CUDA | medium-q5_0 | 8 | 0 | 83.21 | 5.44 | 1.41 | 0.10 | ad4e3509 |
| V100 | AVX2 CUDA | large-v2 | 8 | 0 | 134.81 | 8.64 | 2.69 | 0.14 | ad4e3509 |
| V100 | AVX2 CUDA | large-v2-q5_0 | 8 | 0 | 138.95 | 7.57 | 2.04 | 0.15 | ad4e3509 |
| V100 | AVX2 CUDA | large-v3-turbo | 8 | 0 | 124.42 | 1.37 | 0.43 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | large-v3-turbo-q5_0 | 8 | 0 | 127.81 | 1.13 | 0.32 | 0.03 | ad4e3509 |
| V100 | AVX2 CUDA | tiny | 1 | 0 | 6.21 | 1.11 | 0.30 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | tiny-q5_1 | 1 | 0 | 5.97 | 1.10 | 0.26 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | base | 1 | 0 | 10.95 | 1.47 | 0.42 | 0.03 | 22c96b4 |
| V100 | AVX2 CUDA | base-q5_1 | 1 | 0 | 11.13 | 1.53 | 0.36 | 0.03 | 22c96b4 |
| V100 | AVX2 CUDA | small | 1 | 0 | 31.57 | 2.96 | 0.84 | 0.05 | 22c96b4 |
| V100 | AVX2 CUDA | small-q5_1 | 1 | 0 | 32.19 | 3.14 | 0.75 | 0.05 | 22c96b4 |
| V100 | AVX2 CUDA | medium | 1 | 0 | 85.88 | 6.49 | 1.80 | 0.10 | 22c96b4 |
| V100 | AVX2 CUDA | medium-q5_0 | 1 | 0 | 87.53 | 5.82 | 1.37 | 0.10 | 22c96b4 |
| V100 | AVX2 CUDA | large-v2 | 1 | 0 | 142.23 | 8.92 | 2.62 | 0.15 | 22c96b4 |
WHISPER_CUDA=1 make -j && ./scripts/bench-all.sh 8 1 1
| GPU | Config | Model | Th | FA | Enc. | Dec. | Bch5 | PP | Commit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| V100 | AVX2 CUDA | tiny | 8 | 1 | 4.01 | 0.90 | 0.25 | 0.01 | ad4e3509 |
| V100 | AVX2 CUDA | tiny-q5_1 | 8 | 1 | 4.12 | 0.88 | 0.18 | 0.01 | ad4e3509 |
| V100 | AVX2 CUDA | base | 8 | 1 | 7.00 | 1.30 | 0.35 | 0.01 | ad4e3509 |
| V100 | AVX2 CUDA | base-q5_1 | 8 | 1 | 7.22 | 1.21 | 0.26 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | small | 8 | 1 | 18.68 | 2.39 | 0.69 | 0.03 | ad4e3509 |
| V100 | AVX2 CUDA | small-q5_1 | 8 | 1 | 19.38 | 2.32 | 0.51 | 0.03 | ad4e3509 |
| V100 | AVX2 CUDA | medium | 8 | 1 | 53.17 | 5.15 | 1.45 | 0.06 | ad4e3509 |
| V100 | AVX2 CUDA | medium-q5_0 | 8 | 1 | 55.09 | 4.64 | 1.05 | 0.07 | ad4e3509 |
| V100 | AVX2 CUDA | large-v2 | 8 | 1 | 85.77 | 7.57 | 2.19 | 0.10 | ad4e3509 |
| V100 | AVX2 CUDA | large-v2-q5_0 | 8 | 1 | 89.24 | 6.48 | 1.48 | 0.11 | ad4e3509 |
| V100 | AVX2 CUDA | large-v3-turbo | 8 | 1 | 75.56 | 1.25 | 0.37 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | large-v3-turbo-q5_0 | 8 | 1 | 78.48 | 1.01 | 0.24 | 0.02 | ad4e3509 |
| V100 | AVX2 CUDA | tiny | 1 | 1 | 3.96 | 0.82 | 0.24 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | tiny-q5_1 | 1 | 1 | 4.05 | 0.85 | 0.18 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | base | 1 | 1 | 7.21 | 1.16 | 0.36 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | base-q5_1 | 1 | 1 | 7.39 | 1.21 | 0.26 | 0.02 | 22c96b4 |
| V100 | AVX2 CUDA | small | 1 | 1 | 19.81 | 2.41 | 0.71 | 0.04 | 22c96b4 |
| V100 | AVX2 CUDA | small-q5_1 | 1 | 1 | 20.50 | 2.31 | 0.51 | 0.04 | 22c96b4 |
| V100 | AVX2 CUDA | medium | 1 | 1 | 56.02 | 4.89 | 1.44 | 0.07 | 22c96b4 |
| V100 | AVX2 CUDA | medium-q5_0 | 1 | 1 | 57.85 | 4.73 | 1.09 | 0.08 | 22c96b4 |
| V100 | AVX2 CUDA | large-v2 | 1 | 1 | 92.73 | 7.18 | 2.14 | 0.10 | 22c96b4 |
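A note for reading these tables: `Th` is the CPU thread count and `FA` marks whether flash attention was enabled; the two runs differ only in the final argument to `bench-all.sh` (`0` above, `1` below), which appears to toggle it. The `Enc.`, `Dec.`, `Bch5` and `PP` columns report per-run times in milliseconds for the encoder, single-token decoding, batched decoding (batch of 5) and prompt processing, so lower is better.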

View File

@ -4276,11 +4276,11 @@ void whisper_print_timings(struct whisper_context * ctx) {
WHISPER_LOG_INFO("%s: fallbacks = %3d p / %3d h\n", __func__, ctx->state->n_fail_p, ctx->state->n_fail_h);
WHISPER_LOG_INFO("%s: mel time = %8.2f ms\n", __func__, ctx->state->t_mel_us / 1000.0f);
WHISPER_LOG_INFO("%s: sample time = %8.2f ms / %5d runs ( %8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_sample_us, n_sample, 1e-3f * ctx->state->t_sample_us / n_sample);
WHISPER_LOG_INFO("%s: encode time = %8.2f ms / %5d runs ( %8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_encode_us, n_encode, 1e-3f * ctx->state->t_encode_us / n_encode);
WHISPER_LOG_INFO("%s: decode time = %8.2f ms / %5d runs ( %8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_decode_us, n_decode, 1e-3f * ctx->state->t_decode_us / n_decode);
WHISPER_LOG_INFO("%s: batchd time = %8.2f ms / %5d runs ( %8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_batchd_us, n_batchd, 1e-3f * ctx->state->t_batchd_us / n_batchd);
WHISPER_LOG_INFO("%s: prompt time = %8.2f ms / %5d runs ( %8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_prompt_us, n_prompt, 1e-3f * ctx->state->t_prompt_us / n_prompt);
WHISPER_LOG_INFO("%s: sample time = %8.2f ms / %5d runs (%8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_sample_us, n_sample, 1e-3f * ctx->state->t_sample_us / n_sample);
WHISPER_LOG_INFO("%s: encode time = %8.2f ms / %5d runs (%8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_encode_us, n_encode, 1e-3f * ctx->state->t_encode_us / n_encode);
WHISPER_LOG_INFO("%s: decode time = %8.2f ms / %5d runs (%8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_decode_us, n_decode, 1e-3f * ctx->state->t_decode_us / n_decode);
WHISPER_LOG_INFO("%s: batchd time = %8.2f ms / %5d runs (%8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_batchd_us, n_batchd, 1e-3f * ctx->state->t_batchd_us / n_batchd);
WHISPER_LOG_INFO("%s: prompt time = %8.2f ms / %5d runs (%8.2f ms per run)\n", __func__, 1e-3f * ctx->state->t_prompt_us, n_prompt, 1e-3f * ctx->state->t_prompt_us / n_prompt);
}
WHISPER_LOG_INFO("%s: total time = %8.2f ms\n", __func__, (t_end_us - ctx->t_start_us)/1000.0f);
}
@ -5527,13 +5527,11 @@ int whisper_full_with_state(
const int seek_start = params.offset_ms/10;
const int seek_end = params.duration_ms == 0 ? whisper_n_len_from_state(state) : seek_start + params.duration_ms/10;
// if length of spectrogram is less than 100ms (10 frames), then return
// basically don't process anything that is less than 100ms
// ref: https://github.com/ggml-org/whisper.cpp/issues/2065
const int delta_min = 10;
if (seek_end < seek_start + delta_min) {
WHISPER_LOG_WARN("%s: input is too short - %d ms < 100 ms. consider padding the input audio with silence\n", __func__, (seek_end - seek_start)*10);
// if length of spectrogram is less than 1.0s (100 frames), then return
// basically don't process anything that is less than 1.0s
// see issue #39: https://github.com/ggerganov/whisper.cpp/issues/39
if (seek_end < seek_start + 100) {
WHISPER_LOG_WARN("%s: input is too short - %d ms < 1000 ms. consider padding the input audio with silence\n", __func__, (seek_end - seek_start)*10);
return 0;
}
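Context for the constants in this hunk: `seek` positions count mel frames, which whisper.cpp spaces 10 ms apart, so `params.offset_ms/10` converts milliseconds to frames and `delta_min = 10` frames is the new 100 ms minimum. A minimal sketch of the unit conversion (names here are illustrative, not taken from the codebase):
```
MS_PER_FRAME = 10  # whisper.cpp mel hop: one frame every 10 ms

def ms_to_frames(ms: int) -> int:
    """Convert a duration in milliseconds to a mel-frame count."""
    return ms // MS_PER_FRAME

delta_min = ms_to_frames(100)      # 10 frames == the 100 ms minimum above
assert delta_min == 10
assert ms_to_frames(1000) == 100   # the old 1.0 s threshold was 100 frames
```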
@ -5677,8 +5675,8 @@ int whisper_full_with_state(
ctx, state, progress_cur, params.progress_callback_user_data);
}
// if only 100ms left, then stop
if (seek + delta_min >= seek_end) {
// if only 1 second left, then stop
if (seek + 100 >= seek_end) {
break;
}
@ -6025,10 +6023,10 @@ int whisper_full_with_state(
// end of segment
if (token.id == whisper_token_eot(ctx) || // end of text token
(params.max_tokens > 0 && i >= params.max_tokens) || // max tokens per segment reached
(has_ts && seek + seek_delta + delta_min >= seek_end) // end of audio reached (100ms)
(has_ts && seek + seek_delta + 100 >= seek_end) // end of audio reached
) {
if (result_len == 0 && !params.no_timestamps) {
if (seek + seek_delta + delta_min >= seek_end) {
if (seek + seek_delta + 100 >= seek_end) {
result_len = i + 1;
} else {
WHISPER_LOG_DEBUG("%s: decoder %d failed (result_len = 0)\n", __func__, j);
@ -6377,7 +6375,7 @@ int whisper_full_with_state(
}
}
// ref: https://github.com/ggml-org/whisper.cpp/pull/2629
// ref: https://github.com/ggerganov/whisper.cpp/pull/2629
const bool single_timestamp_ending = tokens_cur.size() > 1 &&
tokens_cur[tokens_cur.size() - 2].id < whisper_token_beg(ctx) &&
tokens_cur[tokens_cur.size() - 1].id > whisper_token_beg(ctx);
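In other words, `single_timestamp_ending` is true when the segment ends with exactly one timestamp token: a text token followed by a timestamp token. A rough Python rendering of the condition (the token ids below are placeholders; in whisper.cpp, ids below `whisper_token_beg` are text/special tokens and ids above it are timestamps):
```
def single_timestamp_ending(token_ids: list[int], beg_id: int) -> bool:
    # True when the segment ends "text token, timestamp token".
    return (len(token_ids) > 1
            and token_ids[-2] < beg_id
            and token_ids[-1] > beg_id)

# With an illustrative beg_id, a single trailing timestamp qualifies,
# while a double-timestamp ending does not:
assert single_timestamp_ending([50, 60, 50400], 50364) is True
assert single_timestamp_ending([50, 50380, 50400], 50364) is False
```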

View File

@ -1,6 +0,0 @@
__pycache__
*.tar.gz
*.txt
eval.conf
venv
LibriSpeech

View File

@ -1,15 +0,0 @@
TAR_URL = https://www.openslr.org/resources/12/test-clean.tar.gz
all: eval
eval:
$(MAKE) -f eval.mk
clean:
$(MAKE) -f eval.mk clean
get-audio:
wget -c $(TAR_URL)
tar -xf test-clean.tar.gz
.PHONY: all eval clean setup-venv clean-venv get-audio

View File

@ -1,60 +0,0 @@
# whisper.cpp/tests/librispeech
[LibriSpeech](https://www.openslr.org/12) is a standard dataset for
training and evaluating automatic speech recognition systems.
This directory contains a set of tools to evaluate the recognition
performance of whisper.cpp on the LibriSpeech corpus.
## Quick Start
1. (Prerequisite) Compile `whisper-cli` and prepare the Whisper
model in `ggml` format.
```
$ # Execute the commands below in the project root dir.
$ cmake -B build
$ cmake --build build --config Release
$ ./models/download-ggml-model.sh tiny
```
Consult [whisper.cpp/README.md](../../README.md) for more details.
2. Download the audio files from the LibriSpeech project.
```
$ make get-audio
```
3. Set up the environment to compute the WER score.
```
$ pip install -r requirements.txt
```
For example, if you use `virtualenv`, you can set it up as follows:
```
$ python3 -m venv venv
$ . venv/bin/activate
$ pip install -r requirements.txt
```
4. Run the benchmark test.
```
$ make
```
## How-to guides
### How to change the inference parameters
Create `eval.conf` and override variables.
```
WHISPER_MODEL = large-v3-turbo
WHISPER_FLAGS = --no-prints --threads 8 --language en --output-txt
```
Check out `eval.mk` for more details.
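Variables can also be overridden per invocation on the command line, e.g. `make WHISPER_MODEL=base`; GNU make forwards command-line assignments to the `eval.mk` sub-make, so this should behave the same as an `eval.conf` entry.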

View File

@ -1,39 +0,0 @@
PYTHON = python
WHISPER_PREFIX = ../../
WHISPER_MODEL = tiny
WHISPER_CLI = $(WHISPER_PREFIX)build/bin/whisper-cli
WHISPER_FLAGS = --no-prints --language en --output-txt
# You can create eval.conf to override the WHISPER_* variables
# defined above.
-include eval.conf
# This follows the file structure of the LibriSpeech project.
AUDIO_SRCS = $(sort $(wildcard LibriSpeech/*/*/*/*.flac))
TRANS_TXTS = $(addsuffix .txt, $(AUDIO_SRCS))
# We output the evaluation result to this file.
DONE = $(WHISPER_MODEL).txt
all: $(DONE)
$(DONE): $(TRANS_TXTS)
$(PYTHON) eval.py > $@.tmp
mv $@.tmp $@
# Note: This task writes to a temporary file first to
# create the target file atomically.
%.flac.txt: %.flac
$(WHISPER_CLI) $(WHISPER_FLAGS) --model $(WHISPER_PREFIX)models/ggml-$(WHISPER_MODEL).bin --file $^ --output-file $^.tmp
mv $^.tmp.txt $^.txt
archive:
tar -czf $(WHISPER_MODEL).tar.gz --exclude="*.flac" LibriSpeech $(DONE)
clean:
@rm -f $(TRANS_TXTS)
@rm -f $(DONE)
.PHONY: all clean

View File

@ -1,47 +0,0 @@
import os
import glob
import jiwer
from normalizers import EnglishTextNormalizer
def get_reference():
ref = {}
for path in glob.glob('LibriSpeech/*/*/*/*.trans.txt'):
with open(path) as fp:
for line in fp:
code, text = line.strip().split(" ", maxsplit=1)
ref[code] = text
return ref
def get_hypothesis():
hyp = {}
for path in glob.glob('LibriSpeech/*/*/*/*.flac.txt'):
with open(path) as fp:
text = fp.read().strip()
code = os.path.basename(path).replace('.flac.txt', '')
hyp[code] = text
return hyp
def get_codes():
codes = []
for path in glob.glob('LibriSpeech/*/*/*/*.flac'):
codes.append(os.path.basename(path).replace('.flac', ''))
return sorted(codes)
def main():
normalizer = EnglishTextNormalizer()
ref_orig = get_reference()
hyp_orig = get_hypothesis()
ref_clean = []
hyp_clean = []
for code in get_codes():
ref_clean.append(normalizer(ref_orig[code]))
hyp_clean.append(normalizer(hyp_orig[code]))
wer = jiwer.wer(ref_clean, hyp_clean)
print(f"WER: {wer * 100:.2f}%")
if __name__ == '__main__':
main()
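As a quick sanity check of what `jiwer.wer` computes here, a self-contained sketch on toy strings (not LibriSpeech data):
```
import jiwer

reference  = ["the cat sat on the mat"]
hypothesis = ["the cat sat on a mat"]

# WER = (substitutions + deletions + insertions) / reference word count;
# here one substitution ("the" -> "a") over six reference words.
wer = jiwer.wer(reference, hypothesis)
print(f"WER: {wer * 100:.2f}%")  # WER: 16.67%
```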

View File

@ -1,25 +0,0 @@
Code in this directory is adapted from OpenAI Whisper project
(https://github.com/openai/whisper) and carries the following
copyright and license.
MIT License
Copyright (c) 2022 OpenAI
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

View File

@ -1,2 +0,0 @@
from .basic import BasicTextNormalizer as BasicTextNormalizer
from .english import EnglishTextNormalizer as EnglishTextNormalizer
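The `import X as X` form is the conventional way to mark these names as intentional re-exports, so type checkers treat `BasicTextNormalizer` and `EnglishTextNormalizer` as the package's public interface.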

View File

@ -1,80 +0,0 @@
import re
import unicodedata
import regex
# non-ASCII letters that are not separated by "NFKD" normalization
ADDITIONAL_DIACRITICS = {
"œ": "oe",
"Œ": "OE",
"ø": "o",
"Ø": "O",
"æ": "ae",
"Æ": "AE",
"ß": "ss",
"": "SS",
"đ": "d",
"Đ": "D",
"ð": "d",
"Ð": "D",
"þ": "th",
"Þ": "th",
"ł": "l",
"Ł": "L",
}
def remove_symbols_and_diacritics(s: str, keep=""):
"""
Replace any other markers, symbols, and punctuations with a space,
and drop any diacritics (category 'Mn' and some manual mappings)
"""
return "".join(
(
c
if c in keep
else (
ADDITIONAL_DIACRITICS[c]
if c in ADDITIONAL_DIACRITICS
else (
""
if unicodedata.category(c) == "Mn"
else " " if unicodedata.category(c)[0] in "MSP" else c
)
)
)
for c in unicodedata.normalize("NFKD", s)
)
def remove_symbols(s: str):
"""
Replace any other markers, symbols, punctuations with a space, keeping diacritics
"""
return "".join(
" " if unicodedata.category(c)[0] in "MSP" else c
for c in unicodedata.normalize("NFKC", s)
)
class BasicTextNormalizer:
def __init__(self, remove_diacritics: bool = False, split_letters: bool = False):
self.clean = (
remove_symbols_and_diacritics if remove_diacritics else remove_symbols
)
self.split_letters = split_letters
def __call__(self, s: str):
s = s.lower()
s = re.sub(r"[<\[][^>\]]*[>\]]", "", s) # remove words between brackets
s = re.sub(r"\(([^)]+?)\)", "", s) # remove words between parenthesis
s = self.clean(s).lower()
if self.split_letters:
s = " ".join(regex.findall(r"\X", s, regex.U))
s = re.sub(
r"\s+", " ", s
) # replace any successive whitespace characters with a space
return s
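A small usage sketch of the class above (the output is traced by hand from these rules, so treat it as illustrative):
```
norm = BasicTextNormalizer(remove_diacritics=True)
# Lowercases, strips the parenthesized aside, drops the accents and the
# comma, and collapses the leftover whitespace:
print(norm("Héllo, (aside) Wörld"))  # -> "hello world"
```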

File diff suppressed because it is too large

View File

@ -1,550 +0,0 @@
import json
import os
import re
from fractions import Fraction
from typing import Iterator, List, Match, Optional, Union
from more_itertools import windowed
from .basic import remove_symbols_and_diacritics
class EnglishNumberNormalizer:
"""
Convert any spelled-out numbers into arabic numbers, while handling:
- remove any commas
- keep the suffixes such as: `1960s`, `274th`, `32nd`, etc.
- spell out currency symbols after the number. e.g. `$20 million` -> `20000000 dollars`
- spell out `one` and `ones`
- interpret successive single-digit numbers as nominal: `one oh one` -> `101`
"""
def __init__(self):
super().__init__()
self.zeros = {"o", "oh", "zero"}
self.ones = {
name: i
for i, name in enumerate(
[
"one",
"two",
"three",
"four",
"five",
"six",
"seven",
"eight",
"nine",
"ten",
"eleven",
"twelve",
"thirteen",
"fourteen",
"fifteen",
"sixteen",
"seventeen",
"eighteen",
"nineteen",
],
start=1,
)
}
self.ones_plural = {
"sixes" if name == "six" else name + "s": (value, "s")
for name, value in self.ones.items()
}
self.ones_ordinal = {
"zeroth": (0, "th"),
"first": (1, "st"),
"second": (2, "nd"),
"third": (3, "rd"),
"fifth": (5, "th"),
"twelfth": (12, "th"),
**{
name + ("h" if name.endswith("t") else "th"): (value, "th")
for name, value in self.ones.items()
if value > 3 and value != 5 and value != 12
},
}
self.ones_suffixed = {**self.ones_plural, **self.ones_ordinal}
self.tens = {
"twenty": 20,
"thirty": 30,
"forty": 40,
"fifty": 50,
"sixty": 60,
"seventy": 70,
"eighty": 80,
"ninety": 90,
}
self.tens_plural = {
name.replace("y", "ies"): (value, "s") for name, value in self.tens.items()
}
self.tens_ordinal = {
name.replace("y", "ieth"): (value, "th")
for name, value in self.tens.items()
}
self.tens_suffixed = {**self.tens_plural, **self.tens_ordinal}
self.multipliers = {
"hundred": 100,
"thousand": 1_000,
"million": 1_000_000,
"billion": 1_000_000_000,
"trillion": 1_000_000_000_000,
"quadrillion": 1_000_000_000_000_000,
"quintillion": 1_000_000_000_000_000_000,
"sextillion": 1_000_000_000_000_000_000_000,
"septillion": 1_000_000_000_000_000_000_000_000,
"octillion": 1_000_000_000_000_000_000_000_000_000,
"nonillion": 1_000_000_000_000_000_000_000_000_000_000,
"decillion": 1_000_000_000_000_000_000_000_000_000_000_000,
}
self.multipliers_plural = {
name + "s": (value, "s") for name, value in self.multipliers.items()
}
self.multipliers_ordinal = {
name + "th": (value, "th") for name, value in self.multipliers.items()
}
self.multipliers_suffixed = {
**self.multipliers_plural,
**self.multipliers_ordinal,
}
self.decimals = {*self.ones, *self.tens, *self.zeros}
self.preceding_prefixers = {
"minus": "-",
"negative": "-",
"plus": "+",
"positive": "+",
}
self.following_prefixers = {
"pound": "£",
"pounds": "£",
"euro": "",
"euros": "",
"dollar": "$",
"dollars": "$",
"cent": "¢",
"cents": "¢",
}
self.prefixes = set(
list(self.preceding_prefixers.values())
+ list(self.following_prefixers.values())
)
self.suffixers = {
"per": {"cent": "%"},
"percent": "%",
}
self.specials = {"and", "double", "triple", "point"}
self.words = set(
[
key
for mapping in [
self.zeros,
self.ones,
self.ones_suffixed,
self.tens,
self.tens_suffixed,
self.multipliers,
self.multipliers_suffixed,
self.preceding_prefixers,
self.following_prefixers,
self.suffixers,
self.specials,
]
for key in mapping
]
)
self.literal_words = {"one", "ones"}
def process_words(self, words: List[str]) -> Iterator[str]:
prefix: Optional[str] = None
value: Optional[Union[str, int]] = None
skip = False
def to_fraction(s: str):
try:
return Fraction(s)
except ValueError:
return None
def output(result: Union[str, int]):
nonlocal prefix, value
result = str(result)
if prefix is not None:
result = prefix + result
value = None
prefix = None
return result
if len(words) == 0:
return
for prev, current, next in windowed([None] + words + [None], 3):
if skip:
skip = False
continue
next_is_numeric = next is not None and re.match(r"^\d+(\.\d+)?$", next)
has_prefix = current[0] in self.prefixes
current_without_prefix = current[1:] if has_prefix else current
if re.match(r"^\d+(\.\d+)?$", current_without_prefix):
# arabic numbers (potentially with signs and fractions)
f = to_fraction(current_without_prefix)
assert f is not None
if value is not None:
if isinstance(value, str) and value.endswith("."):
# concatenate decimals / ip address components
value = str(value) + str(current)
continue
else:
yield output(value)
prefix = current[0] if has_prefix else prefix
if f.denominator == 1:
value = f.numerator # store integers as int
else:
value = current_without_prefix
elif current not in self.words:
# non-numeric words
if value is not None:
yield output(value)
yield output(current)
elif current in self.zeros:
value = str(value or "") + "0"
elif current in self.ones:
ones = self.ones[current]
if value is None:
value = ones
elif isinstance(value, str) or prev in self.ones:
if (
prev in self.tens and ones < 10
): # replace the last zero with the digit
assert value[-1] == "0"
value = value[:-1] + str(ones)
else:
value = str(value) + str(ones)
elif ones < 10:
if value % 10 == 0:
value += ones
else:
value = str(value) + str(ones)
else: # eleven to nineteen
if value % 100 == 0:
value += ones
else:
value = str(value) + str(ones)
elif current in self.ones_suffixed:
# ordinal or cardinal; yield the number right away
ones, suffix = self.ones_suffixed[current]
if value is None:
yield output(str(ones) + suffix)
elif isinstance(value, str) or prev in self.ones:
if prev in self.tens and ones < 10:
assert value[-1] == "0"
yield output(value[:-1] + str(ones) + suffix)
else:
yield output(str(value) + str(ones) + suffix)
elif ones < 10:
if value % 10 == 0:
yield output(str(value + ones) + suffix)
else:
yield output(str(value) + str(ones) + suffix)
else: # eleven to nineteen
if value % 100 == 0:
yield output(str(value + ones) + suffix)
else:
yield output(str(value) + str(ones) + suffix)
value = None
elif current in self.tens:
tens = self.tens[current]
if value is None:
value = tens
elif isinstance(value, str):
value = str(value) + str(tens)
else:
if value % 100 == 0:
value += tens
else:
value = str(value) + str(tens)
elif current in self.tens_suffixed:
# ordinal or cardinal; yield the number right away
tens, suffix = self.tens_suffixed[current]
if value is None:
yield output(str(tens) + suffix)
elif isinstance(value, str):
yield output(str(value) + str(tens) + suffix)
else:
if value % 100 == 0:
yield output(str(value + tens) + suffix)
else:
yield output(str(value) + str(tens) + suffix)
elif current in self.multipliers:
multiplier = self.multipliers[current]
if value is None:
value = multiplier
elif isinstance(value, str) or value == 0:
f = to_fraction(value)
p = f * multiplier if f is not None else None
if f is not None and p.denominator == 1:
value = p.numerator
else:
yield output(value)
value = multiplier
else:
before = value // 1000 * 1000
residual = value % 1000
value = before + residual * multiplier
elif current in self.multipliers_suffixed:
multiplier, suffix = self.multipliers_suffixed[current]
if value is None:
yield output(str(multiplier) + suffix)
elif isinstance(value, str):
f = to_fraction(value)
p = f * multiplier if f is not None else None
if f is not None and p.denominator == 1:
yield output(str(p.numerator) + suffix)
else:
yield output(value)
yield output(str(multiplier) + suffix)
else: # int
before = value // 1000 * 1000
residual = value % 1000
value = before + residual * multiplier
yield output(str(value) + suffix)
value = None
elif current in self.preceding_prefixers:
# apply prefix (positive, minus, etc.) if it precedes a number
if value is not None:
yield output(value)
if next in self.words or next_is_numeric:
prefix = self.preceding_prefixers[current]
else:
yield output(current)
elif current in self.following_prefixers:
# apply prefix (dollars, cents, etc.) only after a number
if value is not None:
prefix = self.following_prefixers[current]
yield output(value)
else:
yield output(current)
elif current in self.suffixers:
# apply suffix symbols (percent -> '%')
if value is not None:
suffix = self.suffixers[current]
if isinstance(suffix, dict):
if next in suffix:
yield output(str(value) + suffix[next])
skip = True
else:
yield output(value)
yield output(current)
else:
yield output(str(value) + suffix)
else:
yield output(current)
elif current in self.specials:
if next not in self.words and not next_is_numeric:
# apply special handling only if the next word can be numeric
if value is not None:
yield output(value)
yield output(current)
elif current == "and":
# ignore "and" after hundreds, thousands, etc.
if prev not in self.multipliers:
if value is not None:
yield output(value)
yield output(current)
elif current == "double" or current == "triple":
if next in self.ones or next in self.zeros:
repeats = 2 if current == "double" else 3
ones = self.ones.get(next, 0)
value = str(value or "") + str(ones) * repeats
skip = True
else:
if value is not None:
yield output(value)
yield output(current)
elif current == "point":
if next in self.decimals or next_is_numeric:
value = str(value or "") + "."
else:
# should all have been covered at this point
raise ValueError(f"Unexpected token: {current}")
else:
# all should have been covered at this point
raise ValueError(f"Unexpected token: {current}")
if value is not None:
yield output(value)
def preprocess(self, s: str):
# replace "<number> and a half" with "<number> point five"
results = []
segments = re.split(r"\band\s+a\s+half\b", s)
for i, segment in enumerate(segments):
if len(segment.strip()) == 0:
continue
if i == len(segments) - 1:
results.append(segment)
else:
results.append(segment)
last_word = segment.rsplit(maxsplit=2)[-1]
if last_word in self.decimals or last_word in self.multipliers:
results.append("point five")
else:
results.append("and a half")
s = " ".join(results)
# put a space at number/letter boundary
s = re.sub(r"([a-z])([0-9])", r"\1 \2", s)
s = re.sub(r"([0-9])([a-z])", r"\1 \2", s)
# but remove spaces which could be a suffix
s = re.sub(r"([0-9])\s+(st|nd|rd|th|s)\b", r"\1\2", s)
return s
def postprocess(self, s: str):
def combine_cents(m: Match):
try:
currency = m.group(1)
integer = m.group(2)
cents = int(m.group(3))
return f"{currency}{integer}.{cents:02d}"
except ValueError:
return m.string
def extract_cents(m: Match):
try:
return f"¢{int(m.group(1))}"
except ValueError:
return m.string
# apply currency postprocessing; "$2 and ¢7" -> "$2.07"
s = re.sub(r"([€£$])([0-9]+) (?:and )?¢([0-9]{1,2})\b", combine_cents, s)
s = re.sub(r"[€£$]0.([0-9]{1,2})\b", extract_cents, s)
# write "one(s)" instead of "1(s)", just for the readability
s = re.sub(r"\b1(s?)\b", r"one\1", s)
return s
def __call__(self, s: str):
s = self.preprocess(s)
s = " ".join(word for word in self.process_words(s.split()) if word is not None)
s = self.postprocess(s)
return s
class EnglishSpellingNormalizer:
"""
Applies British-American spelling mappings as listed in [1].
[1] https://www.tysto.com/uk-us-spelling-list.html
"""
def __init__(self):
mapping_path = os.path.join(os.path.dirname(__file__), "english.json")
self.mapping = json.load(open(mapping_path))
def __call__(self, s: str):
return " ".join(self.mapping.get(word, word) for word in s.split())
class EnglishTextNormalizer:
def __init__(self):
self.ignore_patterns = r"\b(hmm|mm|mhm|mmm|uh|um)\b"
self.replacers = {
# common contractions
r"\bwon't\b": "will not",
r"\bcan't\b": "can not",
r"\blet's\b": "let us",
r"\bain't\b": "aint",
r"\by'all\b": "you all",
r"\bwanna\b": "want to",
r"\bgotta\b": "got to",
r"\bgonna\b": "going to",
r"\bi'ma\b": "i am going to",
r"\bimma\b": "i am going to",
r"\bwoulda\b": "would have",
r"\bcoulda\b": "could have",
r"\bshoulda\b": "should have",
r"\bma'am\b": "madam",
# contractions in titles/prefixes
r"\bmr\b": "mister ",
r"\bmrs\b": "missus ",
r"\bst\b": "saint ",
r"\bdr\b": "doctor ",
r"\bprof\b": "professor ",
r"\bcapt\b": "captain ",
r"\bgov\b": "governor ",
r"\bald\b": "alderman ",
r"\bgen\b": "general ",
r"\bsen\b": "senator ",
r"\brep\b": "representative ",
r"\bpres\b": "president ",
r"\brev\b": "reverend ",
r"\bhon\b": "honorable ",
r"\basst\b": "assistant ",
r"\bassoc\b": "associate ",
r"\blt\b": "lieutenant ",
r"\bcol\b": "colonel ",
r"\bjr\b": "junior ",
r"\bsr\b": "senior ",
r"\besq\b": "esquire ",
# perfect tenses, ideally it should be any past participles, but it's harder..
r"'d been\b": " had been",
r"'s been\b": " has been",
r"'d gone\b": " had gone",
r"'s gone\b": " has gone",
r"'d done\b": " had done", # "'s done" is ambiguous
r"'s got\b": " has got",
# general contractions
r"n't\b": " not",
r"'re\b": " are",
r"'s\b": " is",
r"'d\b": " would",
r"'ll\b": " will",
r"'t\b": " not",
r"'ve\b": " have",
r"'m\b": " am",
}
self.standardize_numbers = EnglishNumberNormalizer()
self.standardize_spellings = EnglishSpellingNormalizer()
def __call__(self, s: str):
s = s.lower()
s = re.sub(r"[<\[][^>\]]*[>\]]", "", s) # remove words between brackets
s = re.sub(r"\(([^)]+?)\)", "", s) # remove words between parenthesis
s = re.sub(self.ignore_patterns, "", s)
s = re.sub(r"\s+'", "'", s) # when there's a space before an apostrophe
for pattern, replacement in self.replacers.items():
s = re.sub(pattern, replacement, s)
s = re.sub(r"(\d),(\d)", r"\1\2", s) # remove commas between digits
s = re.sub(r"\.([^0-9]|$)", r" \1", s) # remove periods not followed by numbers
s = remove_symbols_and_diacritics(s, keep=".%$¢€£") # keep numeric symbols
s = self.standardize_numbers(s)
s = self.standardize_spellings(s)
# now remove prefix/suffix symbols that are not preceded/followed by numbers
s = re.sub(r"[.$¢€£]([^0-9])", r" \1", s)
s = re.sub(r"([^0-9])%", r"\1 ", s)
s = re.sub(r"\s+", " ", s) # replace any successive whitespaces with a space
return s
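Tracing the full pipeline end to end, a small usage sketch (the expected output assumes none of these words have an entry in `english.json`):
```
norm = EnglishTextNormalizer()
# "won't" expands via the contraction rules, and the number normalizer
# turns "twenty dollars" into "$20":
print(norm("He won't pay twenty dollars"))  # -> "he will not pay $20"
```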

View File

@ -1,6 +0,0 @@
# This is the minimal set of dependencies we need to compute
# the WER score. Read Section 3.2 of the original paper
# (https://arxiv.org/abs/2212.04356) for more context.
jiwer
regex
more-itertools