mirror of https://github.com/ggerganov/whisper.cpp.git, synced 2025-08-17 14:52:00 +02:00
Update README.md and finalize the whisper.wasm example
@@ -1,3 +1,27 @@
 # whisper.wasm
 
-Live demo: https://whisper.ggerganov.com
+Inference of [OpenAI's Whisper ASR model](https://github.com/openai/whisper) inside the browser
+
+This example uses a WebAssembly (WASM) port of the [whisper.cpp](https://github.com/ggerganov/whisper.cpp)
+implementation of the transformer to run the inference inside a web page. The audio data does not leave your computer -
+it is processed locally on your machine. The performance is not great but you should be able to achieve 2x or 3x
+real-time for the `tiny` and `base` models on a modern CPU and browser (i.e. transcribe 60 seconds of audio in about
+20-30 seconds).
+
+This WASM port utilizes [WASM SIMD 128-bit intrinsics](https://emcc.zcopy.site/docs/porting/simd/) so you have to make
+sure that [your browser supports them](https://webassembly.org/roadmap/).
+
+The example is capable of running all models up to size `small` inclusive. Beyond that, the memory requirements and
+performance are unsatisfactory. The implementation currently supports only the `Greedy` sampling strategy. Both
+transcription and translation are supported.
+
+Since the model data is quite big (74MB for the `tiny` model) you need to manually load the model into the web-page.
+
+The example supports both loading audio from a file and recording audio from the microphone. The maximum length of the
+audio is limited to 120 seconds.
+
+## Live demo
+
+Link: https://whisper.ggerganov.com
+
+
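
The README text added above asks readers to confirm that their browser supports WASM SIMD. A minimal sketch of such a check, which is not part of this commit: it hands a tiny hand-assembled module containing SIMD instructions to `WebAssembly.validate`, which should return `false` on engines without SIMD support. The helper name `hasWasmSimd` and the byte sequence are assumptions made for illustration.

```javascript
// Hypothetical helper (not from whisper.cpp): reports whether the current
// JavaScript engine accepts WebAssembly 128-bit SIMD instructions.
function hasWasmSimd() {
    if (typeof WebAssembly !== 'object' || typeof WebAssembly.validate !== 'function') {
        return false;
    }
    // Minimal module: a single function of type () -> v128.
    var bytes = new Uint8Array([
        0, 97, 115, 109, 1, 0, 0, 0,   // "\0asm" magic + version 1
        1, 5, 1, 96, 0, 1, 123,        // type section: () -> v128
        3, 2, 1, 0,                    // function section: one function of type 0
        10, 10, 1, 8, 0,               // code section: one 8-byte body, no locals
        65, 0, 253, 15, 253, 98, 11    // i32.const 0; i8x16.splat; i8x16.popcnt; end
    ]);
    // validate() only checks the bytes; nothing is compiled or executed.
    return WebAssembly.validate(bytes);
}
```

A page could call `hasWasmSimd()` before loading a model and show a warning when it returns `false`.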
@@ -162,7 +162,7 @@
 </tr>
 </table>
 
-<br><br>
+<br>
 
 <!-- textarea with height filling the rest of the page -->
 <textarea id="output" rows="20"></textarea>
@@ -254,6 +254,10 @@
     return new type(buffer);
 }
 
+//
+// load model
+//
+
 function loadFile(event, fname) {
     var file = event.target.files[0] || null;
     if (file == null) {
@@ -281,6 +285,10 @@
     reader.readAsArrayBuffer(file);
 }
 
+//
+// audio file
+//
+
 function loadAudio(event) {
     if (!context) {
         context = new AudioContext({sampleRate: 16000});
@@ -327,7 +335,7 @@
 }
 
 //
-// Microphone
+// microphone
 //
 
 var mediaRecorder = null;
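
The README in this commit caps input audio at 120 seconds, and `loadAudio` creates its `AudioContext` at a 16000 Hz sample rate. A hedged sketch of how such a cap could be applied to decoded mono samples follows. The names `clampAudio`, `kSampleRate` and `kMaxAudioSeconds` are hypothetical and chosen for illustration; the page's actual handling is not visible in this diff.

```javascript
// Hypothetical constants, taken from values stated in the commit:
var kSampleRate = 16000;     // Hz, matches the AudioContext created in loadAudio
var kMaxAudioSeconds = 120;  // limit stated in the README

// Returns at most kMaxAudioSeconds of audio from a typed array of mono samples.
function clampAudio(samples) {
    var maxSamples = kMaxAudioSeconds * kSampleRate; // 1,920,000 samples
    if (samples.length <= maxSamples) {
        return samples; // short enough, returned unchanged
    }
    // subarray() creates a view on the same buffer, so nothing is copied.
    return samples.subarray(0, maxSamples);
}
```

For example, a 130-second recording (2,080,000 samples) comes back as a 1,920,000-sample view, while shorter input is returned as-is.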