mirror of https://github.com/ggerganov/whisper.cpp.git synced 2024-11-07 16:44:13 +01:00

History

Georgi Gerganov f96e1c5b78 sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422 ) * sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) * metal : allow env metal variable to override resource path (#1415) * Allow env variable to override resource path * Update ggml-metal.m --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * sync : restore common / main from `master` * sync : restore whisper from `master` * talk-llama : update to latest llama.cpp * ruby : fix build * ggml : fix 32-bit ARM build * ggml : fix MIN / MAX macro collisions + update ios bindings * ggml : fix ifdefs and MIN / MAX again * exampels : fix Obj-C and Swift examples * ggml : fix 32-bit ARM compatibility * ggml : one more attempt to fix 32-bit ARM compat * whisper : fix support for larger graphs --------- Co-authored-by: Chris Raethke <codesoda@users.noreply.github.com>		2023-11-03 21:35:05 +02:00
..
whisper.objc	whisper : add loader class to allow loading from buffer and others (#353 )	2023-01-08 13:03:33 +02:00
whisper.objc.xcodeproj	sync : ggml (backend v2, k-quants, CUDA opts, Metal opts, etc.) (#1422 )	2023-11-03 21:35:05 +02:00
README.md	whisper : Metal and ggml-alloc support (#1270 )	2023-09-15 12:18:18 +03:00

README.md

whisper.objc

Minimal Obj-C application for automatic offline speech recognition. The inference runs locally, on-device.

https://user-images.githubusercontent.com/1991296/197385372-962a6dea-bca1-4d50-bf96-1d8c27b98c81.mp4

Real-time transcription demo:

https://user-images.githubusercontent.com/1991296/204126266-ce4177c6-6eca-4bd9-bca8-0e46d9da2364.mp4

Usage

git clone https://github.com/ggerganov/whisper.cpp
open whisper.cpp/examples/whisper.objc/whisper.objc.xcodeproj/

// If you don't want to convert a Core ML model, you can skip this step by create dummy model
mkdir models/ggml-base.en-encoder.mlmodelc

Make sure to build the project in Release:

Also, don't forget to add the -DGGML_USE_ACCELERATE compiler flag for ggml.c in Build Phases. This can significantly improve the performance of the transcription:

Core ML

If you want to enable Core ML support, you can add the -DWHISPER_USE_COREML -DWHISPER_COREML_ALLOW_FALLBACK compiler flag for whisper.cpp in Build Phases:

Then follow the Core ML support section of readme for convert the model.

In this project, it also added -O3 -DNDEBUG to Other C Flags, but adding flags to app proj is not ideal in real world (applies to all C/C++ files), consider splitting xcodeproj in workspace in your own project.

Metal

You can also enable Metal to make the inference run on the GPU of your device. This might or might not be more efficient compared to Core ML depending on the model and device that you use.

To enable Metal, just add -DGGML_USE_METAL instead off the -DWHISPER_USE_COREML flag and you are ready. This will make both the Encoder and the Decoder run on the GPU.

If you want to run the Encoder with Core ML and the Decoder with Metal then simply add both -DWHISPER_USE_COREML -DGGML_USE_METAL flags. That's all!