models : change default hosting to Hugging Face

My Linode is running out of monthly bandwidth due to the high interest
in the project.
Georgi Gerganov 2022-11-15 19:47:06 +02:00
parent 83c742f1a7
commit 864a78a8d0
GPG Key ID: 449E073F9DC10735
4 changed files with 24 additions and 12 deletions

README.md

@@ -428,11 +428,14 @@ The original models are converted to a custom binary format. This allows to pack
 - vocabulary
 - weights
 
-You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script or from here:
-
-https://ggml.ggerganov.com
+You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script
+or manually from here:
+
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
 For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or the README in [models](models).
 
 ## Bindings
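The new Hugging Face host serves files under a `resolve/main` path. As a sketch of how a direct download URL is composed on the new host (`base.en` is just an example model name, not the only option):

```shell
# Compose the direct download URL on the new Hugging Face host.
# "base.en" is an example model name; any converted model follows the same pattern.
model="base.en"
src="https://huggingface.co/datasets/ggerganov/whisper.cpp"
url="$src/resolve/main/ggml-$model.bin"
echo "$url"
# → https://huggingface.co/datasets/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
```

This URL can then be passed directly to wget or curl if you prefer a manual download over the script.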

models/README.md

@@ -1,10 +1,13 @@
 ## Whisper model files in custom ggml format
 
 The [original Whisper PyTorch models provided by OpenAI](https://github.com/openai/whisper/blob/main/whisper/__init__.py#L17-L27)
-have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed using the
-[convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate the `ggml` files
-yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh) script to download the
-already converted models from https://ggml.ggerganov.com
+have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed
+using the [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate
+the `ggml` files yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh)
+script to download the already converted models. Currently, they are hosted on the following locations:
+
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
 Sample usage:
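Note that the two hosts listed in the hunk above use different file-naming schemes, matching the `src`/`pfx` pairs in the download script. A sketch of both patterns (`base.en` again being an example model name):

```shell
# URL patterns for the two hosting locations; "base.en" is an example model.
model="base.en"
# New default: Hugging Face, files named ggml-<model>.bin under resolve/main
echo "https://huggingface.co/datasets/ggerganov/whisper.cpp/resolve/main/ggml-$model.bin"
# Previous host: files named ggml-model-whisper-<model>.bin
echo "https://ggml.ggerganov.com/ggml-model-whisper-$model.bin"
```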

models/download-ggml-model.cmd

@@ -18,7 +18,7 @@ if %argc% neq 1 (
 set model=%1
 
 for %%b in (%models%) do (
     if "%%b"=="%model%" (
         CALL :download_model
         goto :eof
@@ -41,7 +41,7 @@ if exist "ggml-%model%.bin" (
 PowerShell -NoProfile -ExecutionPolicy Bypass -Command "Invoke-WebRequest -Uri https://ggml.ggerganov.com/ggml-model-whisper-%model%.bin -OutFile ggml-%model%.bin"
 
 if %ERRORLEVEL% neq 0 (
     echo Failed to download ggml model %model%
     echo Please try again later or download the original Whisper model files and convert them yourself.
     goto :eof
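The batch script checks `%ERRORLEVEL%` after the download and bails out with a message on failure. The equivalent guard in POSIX shell looks like this (here `download` is a hypothetical stand-in for the real wget/curl invocation, made to fail so the error path runs):

```shell
# Exit-status check analogous to the batch %ERRORLEVEL% test.
# "download" is a hypothetical stand-in that always fails, to show the error path.
download() { return 1; }

if ! download; then
  echo "Failed to download ggml model"
fi
# → Failed to download ggml model
```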

models/download-ggml-model.sh

@@ -3,6 +3,12 @@
 # This script downloads Whisper model files that have already been converted to ggml format.
 # This way you don't have to convert them yourself.
 
+#src="https://ggml.ggerganov.com"
+#pfx="ggml-model-whisper"
+
+src="https://huggingface.co/datasets/ggerganov/whisper.cpp"
+pfx="resolve/main/ggml"
+
 # get the path of this script
 function get_script_path() {
     if [ -x "$(command -v realpath)" ]; then
@@ -46,7 +52,7 @@ fi
 
 # download ggml model
-printf "Downloading ggml model $model ...\n"
+printf "Downloading ggml model $model from '$src' ...\n"
 
 cd $models_path
@@ -56,9 +62,9 @@ if [ -f "ggml-$model.bin" ]; then
 fi
 
 if [ -x "$(command -v wget)" ]; then
-    wget --quiet --show-progress -O ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    wget --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin
 elif [ -x "$(command -v curl)" ]; then
-    curl --output ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    curl --output ggml-$model.bin $src/$pfx-$model.bin
 else
     printf "Either wget or curl is required to download models.\n"
     exit 1
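The wget-then-curl selection in the hunk above can be isolated into a small helper. A sketch, assuming at least one of the two tools is installed (`pick_downloader` is a hypothetical name, not part of the script):

```shell
# Pick the downloader the same way the script does: wget first, then curl.
pick_downloader() {
  if [ -x "$(command -v wget)" ]; then
    echo "wget"
  elif [ -x "$(command -v curl)" ]; then
    echo "curl"
  else
    return 1
  fi
}

tool=$(pick_downloader) || { printf "Either wget or curl is required to download models.\n"; exit 1; }
echo "using $tool"
```

The `command -v` test is the portable way to probe for a tool; checking the result with `-x` also confirms the resolved path is executable, matching what the script itself does.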