extern/whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-15 00:12:54 +02:00

Latest commit

bec9836849, Emmanuel Schmidbauer
  server : add inference path to make OAI API compatible (#2270)
  2024-07-08 14:24:58 +03:00

Files in examples/server (last change)

CMakeLists.txt
  examples : clean up common code (#1871)
  2024-02-19 10:50:15 +02:00

httplib.h
  server : add a REST Whisper server example with OAI-like API (#1380)
  2023-11-20 21:40:24 +02:00

README.md
  server : fix server temperature + add temperature_inc (#1729)
  2024-01-07 13:35:14 +02:00

server.cpp
  server : add inference path to make OAI API compatible (#2270)
  2024-07-08 14:24:58 +03:00

whisper.cpp http server

A simple HTTP server. WAV files are passed to the inference model via HTTP requests.

https://github.com/ggerganov/whisper.cpp/assets/1991296/e983ee53-8741-4eb5-9048-afe5e4594b8f

Usage

./server -h

usage: ./bin/server [options]

options:
  -h,        --help              [default] show this help message and exit
  -t N,      --threads N         [4      ] number of threads to use during computation
  -p N,      --processors N      [1      ] number of processors to use during computation
  -ot N,     --offset-t N        [0      ] time offset in milliseconds
  -on N,     --offset-n N        [0      ] segment index offset
  -d  N,     --duration N        [0      ] duration of audio to process in milliseconds
  -mc N,     --max-context N     [-1     ] maximum number of text context tokens to store
  -ml N,     --max-len N         [0      ] maximum segment length in characters
  -sow,      --split-on-word     [false  ] split on word rather than on token
  -bo N,     --best-of N         [2      ] number of best candidates to keep
  -bs N,     --beam-size N       [-1     ] beam size for beam search
  -wt N,     --word-thold N      [0.01   ] word timestamp probability threshold
  -et N,     --entropy-thold N   [2.40   ] entropy threshold for decoder fail
  -lpt N,    --logprob-thold N   [-1.00  ] log probability threshold for decoder fail
  -debug,    --debug-mode        [false  ] enable debug mode (eg. dump log_mel)
  -tr,       --translate         [false  ] translate from source language to english
  -di,       --diarize           [false  ] stereo audio diarization
  -tdrz,     --tinydiarize       [false  ] enable tinydiarize (requires a tdrz model)
  -nf,       --no-fallback       [false  ] do not use temperature fallback while decoding
  -ps,       --print-special     [false  ] print special tokens
  -pc,       --print-colors      [false  ] print colors
  -pr,       --print-realtime    [false  ] print output in realtime
  -pp,       --print-progress    [false  ] print progress
  -nt,       --no-timestamps     [false  ] do not print timestamps
  -l LANG,   --language LANG     [en     ] spoken language ('auto' for auto-detect)
  -dl,       --detect-language   [false  ] exit after automatically detecting language
             --prompt PROMPT     [       ] initial prompt
  -m FNAME,  --model FNAME       [models/ggml-base.en.bin] model path
  -oved D,   --ov-e-device DNAME [CPU    ] the OpenVINO device used for encode inference
  --host HOST,                   [127.0.0.1] Hostname/ip-adress for the server
  --port PORT,                   [8080   ] Port number for the server
  --convert,                     [false  ] Convert audio to WAV, requires ffmpeg on the server
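
As an illustration of how the options above combine, the following sketch assembles a launch command from a few of them. The function name `server_command` and the `./server` binary path are assumptions made for this example; the flags and their defaults are taken from the help text.

```python
import shlex


def server_command(model="models/ggml-base.en.bin", host="127.0.0.1",
                   port=8080, threads=4, convert=False, binary="./server"):
    """Build the argv list for launching the server example.

    Defaults mirror the help output above; `binary` is an assumed path.
    """
    argv = [binary, "-m", model, "--host", host, "--port", str(port),
            "-t", str(threads)]
    if convert:
        # --convert requires ffmpeg on the server, per the help text.
        argv.append("--convert")
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable command line; pass the list to
    # subprocess.run() instead to actually start the server.
    print(shlex.join(server_command(convert=True)))
```

Keeping the command as a list rather than a single string avoids shell quoting problems when a model path contains spaces.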

Warning

Do not run the server example with administrative privileges, and make sure it runs in a sandboxed environment: it performs risky operations such as accepting user file uploads and invoking ffmpeg for format conversion. Always validate and sanitize inputs to guard against potential security threats.
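
One way to act on the validation advice is to check uploads in whatever front end sits before the server. The sketch below is not what server.cpp does; the function name `is_acceptable_wav` and the 25 MiB cap are hypothetical choices for illustration. It only accepts data that Python's `wave` module can parse.

```python
import io
import wave

# Hypothetical cap; pick a limit that suits your deployment.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def is_acceptable_wav(data: bytes, max_bytes: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if data is within the size cap and parses as a WAV file."""
    if not data or len(data) > max_bytes:
        return False
    try:
        with wave.open(io.BytesIO(data), "rb") as wav:
            # Reject empty files and unusual channel layouts up front.
            return wav.getnframes() > 0 and wav.getnchannels() in (1, 2)
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, an unsupported encoding, or a truncated header.
        return False
```

A check like this complements sandboxing rather than replacing it, since a file can be well-formed and still be hostile to a downstream decoder.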

request examples

/inference

curl 127.0.0.1:8080/inference \
-H "Content-Type: multipart/form-data" \
-F file="@<file-path>" \
-F temperature="0.0" \
-F temperature_inc="0.2" \
-F response_format="json"
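
The same request can be sent from a script. The sketch below uses only the Python standard library and mirrors the form fields of the curl command; the helper names `encode_multipart` and `transcribe` are made up for this example, and the response body is returned as raw text without assuming its structure.

```python
import io
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields, files=None):
    """Encode form fields and file parts as multipart/form-data.

    fields: dict of name -> str
    files:  dict of name -> (filename, bytes)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    out = io.BytesIO()
    for name, value in fields.items():
        out.write(f"--{boundary}\r\n".encode())
        out.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        out.write(f"{value}\r\n".encode())
    for name, (filename, data) in (files or {}).items():
        out.write(f"--{boundary}\r\n".encode())
        out.write(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'.encode()
        )
        out.write(b"Content-Type: application/octet-stream\r\n\r\n")
        out.write(data)
        out.write(b"\r\n")
    out.write(f"--{boundary}--\r\n".encode())
    return out.getvalue(), f"multipart/form-data; boundary={boundary}"


def transcribe(wav_path, base_url="http://127.0.0.1:8080"):
    """POST a WAV file to /inference and return the response body as text."""
    path = Path(wav_path)
    body, content_type = encode_multipart(
        {"temperature": "0.0", "temperature_inc": "0.2", "response_format": "json"},
        {"file": (path.name, path.read_bytes())},
    )
    req = urllib.request.Request(
        f"{base_url}/inference",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")
```

With a server running on the default host and port, `print(transcribe("<file-path>"))` would print whatever the server returns for that file.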

/load

curl 127.0.0.1:8080/load \
-H "Content-Type: multipart/form-data" \
-F model="<path-to-model-file>"
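
A scripted equivalent of the /load call might look like the sketch below. It is a minimal example, assuming the server listens on the default host and port; `build_load_request` and `load_model` are names invented here, and the model path is sent as-is for the server process to resolve on its own filesystem.

```python
import urllib.request
import uuid


def build_load_request(model_path, base_url="http://127.0.0.1:8080"):
    """Build a multipart/form-data POST to /load carrying one 'model' field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n'
        "\r\n"
        f"{model_path}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/load",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def load_model(model_path, base_url="http://127.0.0.1:8080"):
    """Ask a running server to switch models; returns (status, body text)."""
    with urllib.request.urlopen(build_load_request(model_path, base_url)) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Separating request construction from sending makes it possible to inspect the request without a running server.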