Mirror of https://github.com/ggerganov/whisper.cpp.git (synced 2025-04-23 10:51:22 +02:00)

Commit 2b6d0d2200 (parent 0b17d4507e): rename : ggerganov -> ggml-org (#3005)

README.md: 63 changed lines
@@ -2,12 +2,12 @@
 
-[](https://github.com/ggerganov/whisper.cpp/actions)
+[](https://github.com/ggml-org/whisper.cpp/actions)
 [](https://opensource.org/licenses/MIT)
 [](https://conan.io/center/whisper-cpp)
 [](https://www.npmjs.com/package/whisper.cpp/)
 
-Stable: [v1.7.5](https://github.com/ggerganov/whisper.cpp/releases/tag/v1.7.5) / [Roadmap](https://github.com/users/ggerganov/projects/16/)
+Stable: [v1.7.5](https://github.com/ggml-org/whisper.cpp/releases/tag/v1.7.5) / [Roadmap](https://github.com/orgs/ggml-org/projects/4/)
 
 High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisper) automatic speech recognition (ASR) model:
 
@@ -23,7 +23,7 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
 - [Efficient GPU support for NVIDIA](#nvidia-gpu-support)
 - [OpenVINO Support](#openvino-support)
 - [Ascend NPU Support](#ascend-npu-support)
-- [C-style API](https://github.com/ggerganov/whisper.cpp/blob/master/include/whisper.h)
+- [C-style API](https://github.com/ggml-org/whisper.cpp/blob/master/include/whisper.h)
 
 Supported platforms:
 
@@ -31,14 +31,14 @@ Supported platforms:
 - [x] [iOS](examples/whisper.objc)
 - [x] [Android](examples/whisper.android)
 - [x] [Java](bindings/java/README.md)
-- [x] Linux / [FreeBSD](https://github.com/ggerganov/whisper.cpp/issues/56#issuecomment-1350920264)
+- [x] Linux / [FreeBSD](https://github.com/ggml-org/whisper.cpp/issues/56#issuecomment-1350920264)
 - [x] [WebAssembly](examples/whisper.wasm)
-- [x] Windows ([MSVC](https://github.com/ggerganov/whisper.cpp/blob/master/.github/workflows/build.yml#L117-L144) and [MinGW](https://github.com/ggerganov/whisper.cpp/issues/168))
-- [x] [Raspberry Pi](https://github.com/ggerganov/whisper.cpp/discussions/166)
-- [x] [Docker](https://github.com/ggerganov/whisper.cpp/pkgs/container/whisper.cpp)
+- [x] Windows ([MSVC](https://github.com/ggml-org/whisper.cpp/blob/master/.github/workflows/build.yml#L117-L144) and [MinGW](https://github.com/ggml-org/whisper.cpp/issues/168))
+- [x] [Raspberry Pi](https://github.com/ggml-org/whisper.cpp/discussions/166)
+- [x] [Docker](https://github.com/ggml-org/whisper.cpp/pkgs/container/whisper.cpp)
 
 The entire high-level implementation of the model is contained in [whisper.h](include/whisper.h) and [whisper.cpp](src/whisper.cpp).
-The rest of the code is part of the [`ggml`](https://github.com/ggerganov/ggml) machine learning library.
+The rest of the code is part of the [`ggml`](https://github.com/ggml-org/ggml) machine learning library.
 
 Having such a lightweight implementation of the model allows to easily integrate it in different platforms and applications.
 As an example, here is a video of running the model on an iPhone 13 device - fully offline, on-device: [whisper.objc](examples/whisper.objc)
@@ -51,14 +51,14 @@ https://user-images.githubusercontent.com/1991296/204038393-2f846eae-c255-4099-a
 
 On Apple Silicon, the inference runs fully on the GPU via Metal:
 
-https://github.com/ggerganov/whisper.cpp/assets/1991296/c82e8f86-60dc-49f2-b048-d2fdbd6b5225
+https://github.com/ggml-org/whisper.cpp/assets/1991296/c82e8f86-60dc-49f2-b048-d2fdbd6b5225
 
 ## Quick start
 
 First clone the repository:
 
 ```bash
-git clone https://github.com/ggerganov/whisper.cpp.git
+git clone https://github.com/ggml-org/whisper.cpp.git
 ```
 
 Navigate into the directory:
@@ -222,7 +222,7 @@ speed-up - more than x3 faster compared with CPU-only execution. Here are the in
 The first run on a device is slow, since the ANE service compiles the Core ML model to some device-specific format.
 Next runs are faster.
 
-For more information about the Core ML implementation please refer to PR [#566](https://github.com/ggerganov/whisper.cpp/pull/566).
+For more information about the Core ML implementation please refer to PR [#566](https://github.com/ggml-org/whisper.cpp/pull/566).
 
 ## OpenVINO support
 
@@ -307,7 +307,7 @@ This can result in significant speedup in encoder performance. Here are the inst
 The first time run on an OpenVINO device is slow, since the OpenVINO framework will compile the IR (Intermediate Representation) model to a device-specific 'blob'. This device-specific blob will get
 cached for the next run.
 
-For more information about the OpenVINO implementation please refer to PR [#1037](https://github.com/ggerganov/whisper.cpp/pull/1037).
+For more information about the OpenVINO implementation please refer to PR [#1037](https://github.com/ggml-org/whisper.cpp/pull/1037).
 
 ## NVIDIA GPU support
 
@@ -385,8 +385,8 @@ Run the inference examples as usual, for example:
 
 We have two Docker images available for this project:
 
-1. `ghcr.io/ggerganov/whisper.cpp:main`: This image includes the main executable file as well as `curl` and `ffmpeg`. (platforms: `linux/amd64`, `linux/arm64`)
-2. `ghcr.io/ggerganov/whisper.cpp:main-cuda`: Same as `main` but compiled with CUDA support. (platforms: `linux/amd64`)
+1. `ghcr.io/ggml-org/whisper.cpp:main`: This image includes the main executable file as well as `curl` and `ffmpeg`. (platforms: `linux/amd64`, `linux/arm64`)
+2. `ghcr.io/ggml-org/whisper.cpp:main-cuda`: Same as `main` but compiled with CUDA support. (platforms: `linux/amd64`)
 
 ### Usage
 
@@ -424,7 +424,7 @@ For detailed instructions on how to use Conan, please refer to the [Conan docume
 
 This is a naive example of performing real-time inference on audio from your microphone.
 The [stream](examples/stream) tool samples the audio every half a second and runs the transcription continuously.
-More info is available in [issue #10](https://github.com/ggerganov/whisper.cpp/issues/10).
+More info is available in [issue #10](https://github.com/ggml-org/whisper.cpp/issues/10).
 You will need to have [sdl2](https://wiki.libsdl.org/SDL2/Installation) installed for it to work properly.
 
 ```bash
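The half-second sampling behaviour described in this hunk can be sketched as follows. This is an illustrative Python sketch, not whisper.cpp's actual code; `get_audio` and `transcribe` are hypothetical stand-ins for the SDL capture and the whisper transcription call.

```python
# Illustrative sketch of the stream tool's loop (not whisper.cpp code):
# every ~500 ms, append the newly captured samples to a rolling window and
# re-run transcription on the accumulated window.
def stream_loop(get_audio, transcribe, step_ms=500, steps=3):
    window = []
    for _ in range(steps):
        window.extend(get_audio(step_ms))  # newly captured samples
        yield transcribe(window)           # re-transcribe the whole window

# usage with stub capture/transcription functions (16 kHz -> 16 samples/ms)
texts = list(stream_loop(lambda ms: [0.0] * (16 * ms),
                         lambda w: f"{len(w)} samples"))
print(texts)
```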
@@ -513,7 +513,7 @@ main: processing './samples/jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 pr
 
 ## Speaker segmentation via tinydiarize (experimental)
 
-More information about this approach is available here: https://github.com/ggerganov/whisper.cpp/pull/1058
+More information about this approach is available here: https://github.com/ggml-org/whisper.cpp/pull/1058
 
 Sample usage:
 
@@ -577,7 +577,7 @@ https://user-images.githubusercontent.com/1991296/199337538-b7b0c7a3-2753-4a88-a
 
 ## Video comparison of different models
 
-Use the [scripts/bench-wts.sh](https://github.com/ggerganov/whisper.cpp/blob/master/scripts/bench-wts.sh) script to generate a video in the following format:
+Use the [scripts/bench-wts.sh](https://github.com/ggml-org/whisper.cpp/blob/master/scripts/bench-wts.sh) script to generate a video in the following format:
 
 ```bash
 ./scripts/bench-wts.sh samples/jfk.wav
@@ -594,7 +594,7 @@ In order to have an objective comparison of the performance of the inference acr
 use the [whisper-bench](examples/bench) tool. The tool simply runs the Encoder part of the model and prints how much time it
 took to execute it. The results are summarized in the following Github issue:
 
-[Benchmark results](https://github.com/ggerganov/whisper.cpp/issues/89)
+[Benchmark results](https://github.com/ggml-org/whisper.cpp/issues/89)
 
 Additionally a script to run whisper.cpp with different models and audio files is provided [bench.py](scripts/bench.py).
 
@@ -621,25 +621,24 @@ You can download the converted models using the [models/download-ggml-model.sh](
 or manually from here:
 
 - https://huggingface.co/ggerganov/whisper.cpp
-- https://ggml.ggerganov.com
 
 For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or [models/README.md](models/README.md).
 
-## [Bindings](https://github.com/ggerganov/whisper.cpp/discussions/categories/bindings)
+## [Bindings](https://github.com/ggml-org/whisper.cpp/discussions/categories/bindings)
 
-- [x] Rust: [tazz4843/whisper-rs](https://github.com/tazz4843/whisper-rs) | [#310](https://github.com/ggerganov/whisper.cpp/discussions/310)
-- [x] JavaScript: [bindings/javascript](bindings/javascript) | [#309](https://github.com/ggerganov/whisper.cpp/discussions/309)
+- [x] Rust: [tazz4843/whisper-rs](https://github.com/tazz4843/whisper-rs) | [#310](https://github.com/ggml-org/whisper.cpp/discussions/310)
+- [x] JavaScript: [bindings/javascript](bindings/javascript) | [#309](https://github.com/ggml-org/whisper.cpp/discussions/309)
 - React Native (iOS / Android): [whisper.rn](https://github.com/mybigday/whisper.rn)
-- [x] Go: [bindings/go](bindings/go) | [#312](https://github.com/ggerganov/whisper.cpp/discussions/312)
+- [x] Go: [bindings/go](bindings/go) | [#312](https://github.com/ggml-org/whisper.cpp/discussions/312)
 - [x] Java:
   - [GiviMAD/whisper-jni](https://github.com/GiviMAD/whisper-jni)
-- [x] Ruby: [bindings/ruby](bindings/ruby) | [#507](https://github.com/ggerganov/whisper.cpp/discussions/507)
-- [x] Objective-C / Swift: [ggerganov/whisper.spm](https://github.com/ggerganov/whisper.spm) | [#313](https://github.com/ggerganov/whisper.cpp/discussions/313)
+- [x] Ruby: [bindings/ruby](bindings/ruby) | [#507](https://github.com/ggml-org/whisper.cpp/discussions/507)
+- [x] Objective-C / Swift: [ggml-org/whisper.spm](https://github.com/ggml-org/whisper.spm) | [#313](https://github.com/ggml-org/whisper.cpp/discussions/313)
   - [exPHAT/SwiftWhisper](https://github.com/exPHAT/SwiftWhisper)
-- [x] .NET: | [#422](https://github.com/ggerganov/whisper.cpp/discussions/422)
+- [x] .NET: | [#422](https://github.com/ggml-org/whisper.cpp/discussions/422)
   - [sandrohanea/whisper.net](https://github.com/sandrohanea/whisper.net)
   - [NickDarvey/whisper](https://github.com/NickDarvey/whisper)
-- [x] Python: | [#9](https://github.com/ggerganov/whisper.cpp/issues/9)
+- [x] Python: | [#9](https://github.com/ggml-org/whisper.cpp/issues/9)
   - [stlukey/whispercpp.py](https://github.com/stlukey/whispercpp.py) (Cython)
   - [AIWintermuteAI/whispercpp](https://github.com/AIWintermuteAI/whispercpp) (Updated fork of aarnphm/whispercpp)
   - [aarnphm/whispercpp](https://github.com/aarnphm/whispercpp) (Pybind11)
@@ -667,7 +666,7 @@ let package = Package(
         ]),
         .binaryTarget(
             name: "WhisperFramework",
-            url: "https://github.com/ggerganov/whisper.cpp/releases/download/v1.7.5/whisper-v1.7.5-xcframework.zip",
+            url: "https://github.com/ggml-org/whisper.cpp/releases/download/v1.7.5/whisper-v1.7.5-xcframework.zip",
             checksum: "c7faeb328620d6012e130f3d705c51a6ea6c995605f2df50f6e1ad68c59c6c4a"
         )
     ]
@@ -692,13 +691,13 @@ Some of the examples are even ported to run in the browser using WebAssembly. Ch
 | [whisper.android](examples/whisper.android) | | Android mobile application using whisper.cpp |
 | [whisper.nvim](examples/whisper.nvim) | | Speech-to-text plugin for Neovim |
 | [generate-karaoke.sh](examples/generate-karaoke.sh) | | Helper script to easily [generate a karaoke video](https://youtu.be/uj7hVta4blM) of raw audio capture |
-| [livestream.sh](examples/livestream.sh) | | [Livestream audio transcription](https://github.com/ggerganov/whisper.cpp/issues/185) |
+| [livestream.sh](examples/livestream.sh) | | [Livestream audio transcription](https://github.com/ggml-org/whisper.cpp/issues/185) |
 | [yt-wsp.sh](examples/yt-wsp.sh) | | Download + transcribe and/or translate any VOD [(original)](https://gist.github.com/DaniruKun/96f763ec1a037cc92fe1a059b643b818) |
 | [wchess](examples/wchess) | [wchess.wasm](examples/wchess) | Voice-controlled chess |
 
-## [Discussions](https://github.com/ggerganov/whisper.cpp/discussions)
+## [Discussions](https://github.com/ggml-org/whisper.cpp/discussions)
 
 If you have any kind of feedback about this project feel free to use the Discussions section and open a new topic.
-You can use the [Show and tell](https://github.com/ggerganov/whisper.cpp/discussions/categories/show-and-tell) category
+You can use the [Show and tell](https://github.com/ggml-org/whisper.cpp/discussions/categories/show-and-tell) category
 to share your own projects that use `whisper.cpp`. If you have a question, make sure to check the
-[Frequently asked questions (#126)](https://github.com/ggerganov/whisper.cpp/discussions/126) discussion.
+[Frequently asked questions (#126)](https://github.com/ggml-org/whisper.cpp/discussions/126) discussion.
@@ -51,7 +51,7 @@ func main() {
 In order to build, you need to have the Go compiler installed. You can get it from [here](https://golang.org/dl/). Run the tests with:
 
 ```bash
-git clone https://github.com/ggerganov/whisper.cpp.git
+git clone https://github.com/ggml-org/whisper.cpp.git
 cd whisper.cpp/bindings/go
 make test
 ```
@@ -98,7 +98,7 @@ The API Documentation:
 
 Getting help:
 
-* Follow the discussion for the go bindings [here](https://github.com/ggerganov/whisper.cpp/discussions/312)
+* Follow the discussion for the go bindings [here](https://github.com/ggml-org/whisper.cpp/discussions/312)
 
 ## License
 
@@ -1,5 +1,5 @@
 /*
-github.com/ggerganov/whisper.cpp/bindings/go
+github.com/ggml-org/whisper.cpp/bindings/go
 provides a speech-to-text service bindings for the Go programming language.
 */
 package whisper
@@ -52,7 +52,7 @@ public class Example {
 In order to build, you need to have the JDK 8 or higher installed. Run the tests with:
 
 ```bash
-git clone https://github.com/ggerganov/whisper.cpp.git
+git clone https://github.com/ggml-org/whisper.cpp.git
 cd whisper.cpp/bindings/java
 
 ./gradlew build
@@ -228,7 +228,7 @@ The second argument `samples` may be an array, an object with `length` and `each
 Development
 -----------
 
-% git clone https://github.com/ggerganov/whisper.cpp.git
+% git clone https://github.com/ggml-org/whisper.cpp.git
 % cd whisper.cpp/bindings/ruby
 % rake test
 
@@ -241,5 +241,5 @@ License
 
 The same to [whisper.cpp][].
 
-[whisper.cpp]: https://github.com/ggerganov/whisper.cpp
-[models]: https://github.com/ggerganov/whisper.cpp/tree/master/models
+[whisper.cpp]: https://github.com/ggml-org/whisper.cpp
+[models]: https://github.com/ggml-org/whisper.cpp/tree/master/models
@@ -4,7 +4,7 @@ A very basic tool for benchmarking the inference performance on your device. The
 the transformer on some random audio data and records the execution time. This way we can have an objective comparison
 of the performance of the model for various setups.
 
-Benchmark results are tracked in the following Github issue: https://github.com/ggerganov/whisper.cpp/issues/89
+Benchmark results are tracked in the following Github issue: https://github.com/ggml-org/whisper.cpp/issues/89
 
 ```bash
 # run the bench tool on the small.en model using 4 threads
@@ -40,7 +40,7 @@ system_info: n_threads = 4 | AVX2 = 0 | AVX512 = 0 | NEON = 1 | FP16_VA = 1 | WA
 
 If you wish, you can submit these results here:
 
-https://github.com/ggerganov/whisper.cpp/issues/89
+https://github.com/ggml-org/whisper.cpp/issues/89
 
 Please include the following information:
 
@@ -3,7 +3,7 @@
 // Speak short text commands to the microphone.
 // This program will detect your voice command and convert them to text.
 //
-// ref: https://github.com/ggerganov/whisper.cpp/issues/171
+// ref: https://github.com/ggml-org/whisper.cpp/issues/171
 //
 
 #include "common-sdl.h"
@@ -2,7 +2,7 @@
 #
 # Transcribe audio livestream by feeding ffmpeg output to whisper.cpp at regular intervals
 # Idea by @semiformal-net
-# ref: https://github.com/ggerganov/whisper.cpp/issues/185
+# ref: https://github.com/ggml-org/whisper.cpp/issues/185
 #
 
 set -eo pipefail
@@ -2,7 +2,7 @@
 #
 # Transcribe twitch.tv livestream by feeding audio input to whisper.cpp at regular intervals
 # Thanks to @keyehzy
-# ref: https://github.com/ggerganov/whisper.cpp/issues/209
+# ref: https://github.com/ggml-org/whisper.cpp/issues/209
 #
 # The script currently depends on the third-party tool "streamlink"
 # On Mac OS, you can install it via "brew install streamlink"
@@ -5,7 +5,7 @@
 # This simple script is called by Neovim to capture audio from the microphone and transcribe it with Whisper.
 # In order for this to work, you need to clone the whisper.cpp repo and build the 'stream' tool
 #
-# git clone https://github.com/ggerganov/whisper.cpp
+# git clone https://github.com/ggml-org/whisper.cpp
 # cd whisper.cpp
 # make stream
 #
@@ -31,7 +31,7 @@
 model="base.en"
 
 # export the path to the whisper.cpp repo in the WHISPER_CPP_HOME env variable
-# https://github.com/ggerganov/whisper.cpp
+# https://github.com/ggml-org/whisper.cpp
 cd "${WHISPER_CPP_HOME}"
 
 if [ ! -f ./stream ] ; then
@@ -30,7 +30,7 @@ Link: https://ggerganov.github.io/whisper.cpp/
 
 ```bash (v3.1.2)
 # build using Emscripten
-git clone https://github.com/ggerganov/whisper.cpp
+git clone https://github.com/ggml-org/whisper.cpp
 cd whisper.cpp
 mkdir build-em && cd build-em
 emcmake cmake ..
@@ -25,12 +25,12 @@
 # SOFTWARE.
 
 # Small shell script to more easily automatically download and transcribe live stream VODs.
-# This uses YT-DLP, ffmpeg and the CPP version of Whisper: https://github.com/ggerganov/whisper.cpp
+# This uses YT-DLP, ffmpeg and the CPP version of Whisper: https://github.com/ggml-org/whisper.cpp
 # Use `./examples/yt-wsp.sh help` to print help info.
 #
 # Sample usage:
 #
-# git clone https://github.com/ggerganov/whisper.cpp
+# git clone https://github.com/ggml-org/whisper.cpp
 # cd whisper.cpp
 # make
 # ./examples/yt-wsp.sh https://www.youtube.com/watch?v=1234567890
@@ -44,7 +44,7 @@ SCRIPT_DIR="${SCRIPT_PATH%/*}"
 
 ################################################################################
 # Documentation on downloading models can be found in the whisper.cpp repo:
-# https://github.com/ggerganov/whisper.cpp/#usage
+# https://github.com/ggml-org/whisper.cpp/#usage
 #
 # note: unless a multilingual model is specified, WHISPER_LANG will be ignored
 # and the video will be transcribed as if the audio were in the English language
@@ -103,10 +103,10 @@ check_requirements() {
     fi;
 
     if ! command -v "${WHISPER_EXECUTABLE}" &>/dev/null; then
-        echo "The C++ implementation of Whisper is required: https://github.com/ggerganov/whisper.cpp"
+        echo "The C++ implementation of Whisper is required: https://github.com/ggml-org/whisper.cpp"
         echo "Sample usage:";
         echo "";
-        echo " git clone https://github.com/ggerganov/whisper.cpp";
+        echo " git clone https://github.com/ggml-org/whisper.cpp";
         echo " cd whisper.cpp";
         echo " make";
         echo " ./examples/yt-wsp.sh https://www.youtube.com/watch?v=1234567890";
@@ -24,8 +24,7 @@ You can now use it like this:
 
 `ggml` models are available from the following locations:
 
-- https://huggingface.co/ggerganov/whisper.cpp/tree/main
-- https://ggml.ggerganov.com
+- https://huggingface.co/ggml-org/whisper.cpp/tree/main
 
 ### 3. Convert with [convert-pt-to-ggml.py](convert-pt-to-ggml.py)
 
@@ -78,7 +77,7 @@ OpenAI format. To read the HF models you can use the [convert-h5-to-ggml.py](con
 
 ```bash
 git clone https://github.com/openai/whisper
-git clone https://github.com/ggerganov/whisper.cpp
+git clone https://github.com/ggml-org/whisper.cpp
 
 # clone HF fine-tuned model (this is just an example)
 git clone https://huggingface.co/openai/whisper-medium
@@ -96,7 +95,7 @@ Currently, the chunk-based transcription strategy is not implemented, so there c
 ```bash
 # clone OpenAI whisper and whisper.cpp
 git clone https://github.com/openai/whisper
-git clone https://github.com/ggerganov/whisper.cpp
+git clone https://github.com/ggml-org/whisper.cpp
 
 # get the models
 cd whisper.cpp/models
@@ -3,7 +3,7 @@
 # Usage:
 #
 # git clone https://github.com/openai/whisper
-# git clone https://github.com/ggerganov/whisper.cpp
+# git clone https://github.com/ggml-org/whisper.cpp
 # git clone https://huggingface.co/openai/whisper-medium
 #
 # python3 ./whisper.cpp/models/convert-h5-to-ggml.py ./whisper-medium/ ./whisper .
@@ -12,7 +12,7 @@
 #
 # For more info:
 #
-# https://github.com/ggerganov/whisper.cpp/issues/157
+# https://github.com/ggml-org/whisper.cpp/issues/157
 #
 
 import io
@@ -5529,7 +5529,7 @@ int whisper_full_with_state(
 
     // if length of spectrogram is less than 1.0s (100 frames), then return
    // basically don't process anything that is less than 1.0s
-    // see issue #39: https://github.com/ggerganov/whisper.cpp/issues/39
+    // see issue #39: https://github.com/ggml-org/whisper.cpp/issues/39
     if (seek_end < seek_start + 100) {
         WHISPER_LOG_WARN("%s: input is too short - %d ms < 1000 ms. consider padding the input audio with silence\n", __func__, (seek_end - seek_start)*10);
         return 0;
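The guard in this hunk works in spectrogram frames: each log-mel frame covers 10 ms, so 100 frames is exactly 1.0 s, and the warning prints `(seek_end - seek_start)*10` ms. A minimal Python restatement of the same arithmetic (illustrative helper names, not the library's API):

```python
FRAME_MS = 10  # one log-mel spectrogram frame covers 10 ms of audio

def input_too_short(seek_start: int, seek_end: int) -> bool:
    # mirrors the C++ check: reject anything shorter than 100 frames (1000 ms)
    return seek_end < seek_start + 100

def duration_ms(seek_start: int, seek_end: int) -> int:
    # the value the warning message reports
    return (seek_end - seek_start) * FRAME_MS

print(input_too_short(0, 99), duration_ms(0, 99))    # 990 ms -> rejected
print(input_too_short(0, 100), duration_ms(0, 100))  # exactly 1.0 s -> processed
```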
@@ -6375,7 +6375,7 @@ int whisper_full_with_state(
                 }
             }
 
-            // ref: https://github.com/ggerganov/whisper.cpp/pull/2629
+            // ref: https://github.com/ggml-org/whisper.cpp/pull/2629
             const bool single_timestamp_ending = tokens_cur.size() > 1 &&
                 tokens_cur[tokens_cur.size() - 2].id < whisper_token_beg(ctx) &&
                 tokens_cur[tokens_cur.size() - 1].id > whisper_token_beg(ctx);
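The condition in this hunk compares token ids against the first timestamp token id (`whisper_token_beg`): a segment has a "single timestamp ending" when the second-to-last token is an ordinary text token (id below the boundary) and only the last token is a timestamp (id above it). A hedged Python restatement, with an illustrative boundary id:

```python
def single_timestamp_ending(token_ids, beg_id):
    # True when only the final token is a timestamp (id > beg_id) while the
    # token just before it is ordinary text (id < beg_id)
    return len(token_ids) > 1 and token_ids[-2] < beg_id and token_ids[-1] > beg_id

BEG = 50364  # illustrative id for the first timestamp token, not taken from the source
print(single_timestamp_ending([120, 431, 50500], BEG))    # text token, then one timestamp
print(single_timestamp_ending([120, 50400, 50500], BEG))  # two trailing timestamps
```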