* Fix signature of URI.new's return value
* Use path instead of string | _ToPath
* Add document comment to RBS
* Remove unnecessary build flags
* Remove unnecessary line
* Remove files that have become unnecessary
* Make gem install accept build options for whisper.cpp
* Add instructions for build options in README
* Add check methods to Options
* Test build options
* Rename: configs -> options
* Add assert_installed assertion
* Use assert_installed
* Remove unused attribute
* Extract dependency check logic as Dependencies class
* Update README
* Add WHISPER_FFMPEG option
* Test extra build options only on local test
* Bump version to 1.3.2 [skip ci]
FFmpeg introduced a new channel layout API that uses the `AVChannelLayout`
interface in v6.0. It subsequently dropped the old bitmask-based API
in v7.0.
This updates decode_audio() to support the new channel layout API,
so that we can compile `whisper-cli` and `whisper-server` with FFmpeg
v7.0 or later.
Tested on Ubuntu 24.10 with FFmpeg v7.0.2.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
* Use CMake to build shared object
* Make Rakefile follow change of build process
* Add test for packaging
* Run CI for Ruby bindings almost always
because each CMakeLists.txt might affect Ruby bindings
* Enable PIC
* Bump Ruby version to 3.2 on CI
* Check libgomp
* Check dependency of whisper.cpp accurately
FFmpeg integration was introduced in 1b51fdf by William Tambellini,
but not mentioned in the main documentation.
Add a short guide on how to enable the feature. Confirmed to work
on both Ubuntu 24.04 and Fedora 39.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
This commit adds a check for the visionos build version used with vtool
in build-xcframework.sh. The script now checks the Xcode version and
determines whether to use "xros" or "visionos" for the build version.
This commit also invokes vtool through xcrun so that the version of vtool
in the Xcode command line tools is used instead of the one in the system
path.
Refs: https://github.com/ggml-org/whisper.cpp/pull/2994#issuecomment-2773292223
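The version check described above can be sketched in Python as follows. This is only an illustration, not the logic in build-xcframework.sh: the function names are hypothetical, the cutoff version is deliberately left to the caller because the commit message does not state it, and the direction of the switch (newer Xcode expects "visionos", older expects "xros") is an assumption.

```python
import re


def parse_xcode_version(version_output):
    """Extract (major, minor) from text shaped like `xcodebuild -version` output."""
    match = re.search(r"Xcode\s+(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("no Xcode version found in input")
    return int(match.group(1)), int(match.group(2) or 0)


def vtool_platform_name(version_output, first_visionos_version):
    """Pick the platform name to pass to vtool.

    ASSUMPTION: Xcode versions at or above `first_visionos_version`
    (a (major, minor) tuple supplied by the caller) expect "visionos",
    and older versions expect "xros".
    """
    if parse_xcode_version(version_output) >= first_visionos_version:
        return "visionos"
    return "xros"
```

For example, `vtool_platform_name("Xcode 15.4", (16, 3))` selects "xros" when the caller picks 16.3 as a placeholder cutoff.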
* tests : add script to benchmark whisper.cpp on LibriSpeech corpus
LibriSpeech is a widely-used benchmark dataset for training and
testing speech recognition models.
This adds a set of scripts to measure the recognition accuracy of
whisper.cpp models, following the common benchmark standards.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
* Document how to prepare `whisper-cli` and model files
Feedback from Daniel Bevenius.
This adds a short code example of how to prepare the `whisper-cli`
command, to make the initial setup step a little bit clearer.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
* tests : Simplify how to set up Python environment
Based on feedback from Georgi Gerganov.
Instead of setting up a virtual environment in Makefile, let users
set up the Python environment. This is better since users may have
their own preferred workflow/toolkit.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
---------
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
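For readers unfamiliar with how recognition accuracy is usually scored, here is a minimal sketch of word error rate (WER), the metric commonly reported on LibriSpeech. It is an assumption that the benchmark scripts report WER; the function names are hypothetical, and the lowercase-and-split normalization is a simplification of what real benchmark tooling does.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (substitute/insert/delete)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_tok in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_tok in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_tok != hyp_tok),   # substitution or match
            ))
        prev = cur
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance divided by the reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)
```

Calling `word_error_rate("the cat sat", "the bat sat")` counts one substitution against three reference words.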
The benchmark script 'scripts/bench-all.sh' assumes that the 11th
field of the output line is a timestamp. This assumption does not
hold when the target model takes a bit longer to process.
Fix this issue by introducing an explicit whitespace to the output
lines of `whisper_print_timings()`.
Signed-off-by: Fujimoto Seiji <fujimoto@ceptord.net>
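The field-shift problem described above can be reproduced with a small Python sketch. The format string below is illustrative only and is not copied from `whisper_print_timings()`; `timing_line` and `field` are hypothetical helpers. It shows how a fixed-width number printed directly after a parenthesis fuses with it once the value fills the full width, which changes the position of every later field.

```python
def timing_line(ms_per_run, explicit_space):
    """Build a timing line with an 8-character-wide per-run value.

    ASSUMPTION: this layout is made up for illustration; only the idea of
    adding an explicit space before the number comes from the commit message.
    """
    gap = " " if explicit_space else ""
    return (
        f"whisper_print_timings:   encode time = {1234.5:8.2f} ms / "
        f"{1:5d} runs ({gap}{ms_per_run:8.2f} ms per run)"
    )


def field(line, n):
    """Return the n-th whitespace-separated field, counting from 1."""
    return line.split()[n - 1]
```

With a small value the padding separates "(" from the number, so field 11 is the number. With a value that fills all 8 characters the two fuse into one token and field 11 becomes "ms", unless the explicit space is present.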
This commit updates examples/server.py which is used to serve the wasm
examples locally. The changes include:
- Added a redirect from the root URL to /whisper.cpp.
So now accessing http://localhost:8000/ will redirect to
http://localhost:8000/whisper.cpp/ which matches the URL for the app
deployed to GitHub Pages.
- Custom handling for coi-serviceworker.js to avoid
an error in the console. This file is not strictly necessary
for the local server to work as the headers are provided already but
it is nice to not have an error in the console.
- Fixed the shutdown of the server to ensure it exits cleanly
on Ctrl+C. Previously it would continue to hang onto the port even
after the process had exited.
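The redirect and clean-shutdown behavior listed above can be sketched with the standard library as follows. This is not the code in examples/server.py: the class and function names are hypothetical, the choice of a 302 status is an assumption, and the coi-serviceworker.js handling and extra headers are left out.

```python
import http.server
import socketserver

APP_PREFIX = "/whisper.cpp/"  # path the root URL redirects to


class RedirectingHandler(http.server.SimpleHTTPRequestHandler):
    """Serve static files, but send requests for "/" to APP_PREFIX."""

    def do_GET(self):
        if self.path == "/":
            self.send_response(302)  # ASSUMPTION: temporary redirect
            self.send_header("Location", APP_PREFIX)
            self.end_headers()
            return
        super().do_GET()


class ReusableServer(socketserver.TCPServer):
    # Allow rebinding the port right after a restart.
    allow_reuse_address = True


def serve(port=8000):
    """Run until Ctrl+C, then release the listening socket."""
    with ReusableServer(("", port), RedirectingHandler) as httpd:
        try:
            httpd.serve_forever()
        except KeyboardInterrupt:
            pass  # leaving the with-block closes the socket
```

Calling `serve()` starts the server on port 8000. Leaving the `with` block on Ctrl+C is what closes the socket so the port is freed.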
* whisper.wasm : fix unknown language issue
This commit addresses an issue with whisper.wasm where the following
error was being displayed when running the application in GitHub Pages:
```
whisper_lang_id: unknown language 'д=␙c'
```
This turned out to be a memory corruption issue and further details
can be found in the reference issue below.
Refs: https://github.com/ggerganov/whisper.cpp/issues/2998
* cpu: refactor SIMD mappings and vectorized op functions into separate files
* Fix warning for ggml_float to float
* Fix warnings
* cpu: move all the operations (except mul_mat) to a separate c++ file
* fix whitespace
* Update ggml/src/ggml-cpu/vec.h
Co-authored-by: Diego Devesa <slarengh@gmail.com>
* Fix PR comments - use GGML_UNUSED, use cassert in ops.cpp
* Reverse the order of import for ops.h and vec.h, to match what was present in ggml-cpu.c previously
---------
Co-authored-by: Diego Devesa <slarengh@gmail.com>
This adds a section to the README.md file that describes how to use the
XCFramework.
The motivation for this is that it is not obvious how to use the
XCFramework and an example will help.
One thing to note is that the example is using the latest release
including the checksum. We are thinking about how we might automate
this in the future but for now this is a good start.
* Rename oneMKL Interface to oneMath
* Use oneMath for Intel vendor
* Rename occurrences to mkl
* clang-format
* Silence verbose warnings
* Set oneMath HIP_TARGETS
* Fix silence warnings
* Remove step to build oneMath from build instructions
* Use fixed oneMath version
* Remove INTEL_CPU
* Fold CMake oneDNN conditions
* Use Intel oneMKL for Intel devices
* Improve CMake message
* Link against MKL::MKL_SYCL::BLAS only
* Move oneMath documentation to Nvidia and AMD sections
This commit removes test-whisper-cli-tiny-en from the gh label.
The motivation for this change is that until recently the tests were
disabled. But now that they are enabled some of the tests, specifically
the ci jobs that use sanitizers (e.g. thread-sanitizer) take a long time
to run as they are instrumented.
Some of these jobs also have matrices, which means that multiple jobs
are created that all run these tests.
The suggestion here is to limit the number of tests that are run in the
ci jobs to cut down the CI build time.
* coreml: fix Whisper to CoreML conversion by disabling SDPA
This commit disables the use of PyTorch's
`scaled_dot_product_attention` in the Whisper model to avoid
compatibility issues during CoreML conversion.
The issue occurs because coremltools requires PyTorch 2.5.0, but the
Whisper implementation may expect behavior from newer PyTorch versions.
By setting `MultiHeadAttention.use_sdpa = False`, we force Whisper to
use its fallback manual attention implementation, which works correctly
with PyTorch 2.5.0 during the tracing process.
Refs: https://github.com/ggerganov/whisper.cpp/issues/2783
* coreml: fix audio shape in whisper decoder conversion
This commit fixes the audio shape in the whisper decoder conversion
script.
The motivation for this is that the audio shape was incorrect and
was causing the conversion to fail.
* coreml : set -e in generate-coreml-interface.sh
The commit sets the -e flag in the generate-coreml-interface.sh script
to make sure the script fails if any command fails.
* coreml : update generated encoder/decoder interfaces
This commit updates the generated encoder/decoder interfaces for the
whisper model which is the result of running the
generate-coreml-interface.sh script.
* ci : add coreml job that converts base.en to coreml [no ci]
This commit adds a new job to the CI pipeline that downloads the base.en
model and converts it to CoreML format. The CoreML model is then packed
into a zip file and uploaded as an artifact.
This will only be done for pushes to master, releases, or pre-releases.
Refs: https://github.com/ggerganov/whisper.cpp/issues/2783
* coreml : remove publishing of coreml model
* ci : add GGML_OPENMP=OFF to ubuntu-22-gcc-sanitized
This commit re-enables the tests in the build process which are
currently commented out.
It is possible to build the tests using `-DWHISPER_BUILD_TESTS=ON` and
then run a single test using:
```console
$ ctest -R test-whisper-cli-tiny.en --test-dir build
Internal ctest changing into directory: /home/danbev/work/ai/whisper-work/build
Test project /home/danbev/work/ai/whisper-work/build
Start 2: test-whisper-cli-tiny.en
1/1 Test #2: test-whisper-cli-tiny.en ......... Passed 4.44 sec
100% tests passed, 0 tests failed out of 1
Label Time Summary:
en = 4.44 sec*proc (1 test)
gh = 4.44 sec*proc (1 test)
tiny = 4.44 sec*proc (1 test)
Total Test time (real) = 4.44 sec
```
Some of the tests take a long time to run so it might not be a good idea
to enable them in CI, or perhaps we could only run a subset of the tests
in CI.
This commit re-enables the android_java job in the CI workflow. The job
was disabled because of a failing build.
The motivation for this is that commit
226d344f565ea6140e7c6a583bc300a64454af58 ("whisper.android.java : update
build with ggml source changes") addressed build issues and it should
now be possible to re-enable this job.