* examples : reduce initial memory to 512MB

This commit reduces the initial memory size to 512MB to avoid WebAssembly memory allocation issues on some platforms. It also adds a flag to allow the memory to grow dynamically (up to the maximum).

The motivation for this change is that the initial memory is currently set to 2GB, which might be too large for some platforms. This leads to an error being thrown from the JavaScript code generated by Emscripten when trying to allocate memory. More details can be found in the referenced issue below.

* examples : set MAXIMUM_MEMORY instead of TOTAL_MEMORY

This commit sets MAXIMUM_MEMORY instead of TOTAL_MEMORY in the whisper.wasm example. The motivation for this is that TOTAL_MEMORY and INITIAL_MEMORY are actually the same thing; what we want instead is to set MAXIMUM_MEMORY to 2GB.

Refs: https://github.com/ggerganov/whisper.cpp/issues/2920
Refs: https://emscripten.org/docs/tools_reference/settings_reference.html#initial-memory
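The settings described above can be sketched as Emscripten link options. This is a hypothetical CMake fragment illustrating the commit messages; the actual target name and the exact flag spelling in the example's CMakeLists.txt may differ:

```cmake
# Hypothetical sketch of the memory settings described above:
# start small, allow the heap to grow, and cap growth at 2GB.
target_link_options(whisper.wasm PRIVATE
    "-sINITIAL_MEMORY=512MB"    # replaces TOTAL_MEMORY=2GB (TOTAL_MEMORY is an alias for INITIAL_MEMORY)
    "-sMAXIMUM_MEMORY=2GB"      # upper bound for memory growth
    "-sALLOW_MEMORY_GROWTH=1"   # let the WASM heap grow between the two limits
)
```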
# whisper.wasm
Inference of OpenAI's Whisper ASR model inside the browser.

This example uses a WebAssembly (WASM) port of the whisper.cpp implementation of the transformer to run the inference inside a web page. The audio data does not leave your computer - it is processed locally on your machine. The performance is not great, but you should be able to achieve x2 or x3 real-time for the tiny and base models on a modern CPU and browser (i.e. transcribe 60 seconds of audio in about 20-30 seconds).
This WASM port utilizes WASM SIMD 128-bit intrinsics so you have to make sure that your browser supports them.
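One way to check for WASM SIMD support is to ask the engine to validate a tiny module that uses a v128 instruction. This is a minimal sketch of that feature-detection technique (the byte sequence follows the approach used by the wasm-feature-detect project; it is not part of this example):

```javascript
// Minimal WASM module whose only purpose is to contain a SIMD (v128)
// instruction. WebAssembly.validate() returns true only if the engine
// understands it, i.e. only if WASM SIMD is supported.
const simdTestModule = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0,                   // "\0asm" magic + version
  1, 5, 1, 96, 0, 1, 123,                        // type section: () -> v128
  3, 2, 1, 0,                                    // function section: one func of type 0
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11,  // code: i32.const 0; i8x16.splat; <simd op>; end
]);

const hasSimd = WebAssembly.validate(simdTestModule);
console.log("WASM SIMD supported:", hasSimd);
```

Running this in the browser console (or in Node.js) prints whether the engine can run the SIMD build.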
The example is capable of running all models up to size small inclusive. Beyond that, the memory requirements and performance are unsatisfactory. The implementation currently supports only the Greedy sampling strategy. Both transcription and translation are supported.
Since the model data is quite big (74MB for the tiny model) you need to manually load the model into the web page.

The example supports both loading audio from a file and recording audio from the microphone. The maximum length of the audio is limited to 120 seconds.
## Live demo
Link: https://whisper.ggerganov.com
## Build instructions
```bash
# build using Emscripten
git clone https://github.com/ggerganov/whisper.cpp
cd whisper.cpp
mkdir build-em && cd build-em
emcmake cmake ..
make -j
```
The example can then be started by running a local HTTP server:

```bash
python3 examples/server.py
```

And then opening a browser to the following URL: http://localhost:8000/whisper.wasm
To run the example on a different server, you need to copy the following files to the server's HTTP path:

```bash
# copy the produced page to your HTTP path
cp bin/whisper.wasm/* /path/to/html/
cp bin/libmain.worker.js /path/to/html/
```