diff --git a/README.md b/README.md
index bef03fcd..24fcc2ac 100644
--- a/README.md
+++ b/README.md
@@ -428,11 +428,14 @@ The original models are converted to a custom binary format. This allows to pack
 - vocabulary
 - weights
 
-You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script or from here:
+You can download the converted models using the [models/download-ggml-model.sh](models/download-ggml-model.sh) script
+or manually from here:
 
-https://ggml.ggerganov.com
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
-For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or the README in [models](models).
+For more details, see the conversion script [models/convert-pt-to-ggml.py](models/convert-pt-to-ggml.py) or the README
+in [models](models).
 
 ## Bindings
 
diff --git a/models/README.md b/models/README.md
index ed82da70..26353018 100644
--- a/models/README.md
+++ b/models/README.md
@@ -1,10 +1,13 @@
 ## Whisper model files in custom ggml format
 
 The [original Whisper PyTorch models provided by OpenAI](https://github.com/openai/whisper/blob/main/whisper/__init__.py#L17-L27)
-have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed using the
-[convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate the `ggml` files
-yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh) script to download the
-already converted models from https://ggml.ggerganov.com
+have been converted to custom `ggml` format in order to be able to load them in C/C++. The conversion has been performed
+using the [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate
+the `ggml` files yourself using the conversion script, or you can use the [download-ggml-model.sh](download-ggml-model.sh)
+script to download the already converted models. Currently, they are hosted on the following locations:
+
+- https://huggingface.co/datasets/ggerganov/whisper.cpp
+- https://ggml.ggerganov.com
 
 Sample usage:
 
diff --git a/models/download-ggml-model.cmd b/models/download-ggml-model.cmd
index bc64e2ac..52fde946 100644
--- a/models/download-ggml-model.cmd
+++ b/models/download-ggml-model.cmd
@@ -18,7 +18,7 @@ if %argc% neq 1 (
 
 set model=%1
 
-for %%b in (%models%) do ( 
+for %%b in (%models%) do (
     if "%%b"=="%model%" (
         CALL :download_model
         goto :eof
@@ -41,7 +41,7 @@ if exist "ggml-%model%.bin" (
 
 PowerShell -NoProfile -ExecutionPolicy Bypass -Command "Invoke-WebRequest -Uri https://ggml.ggerganov.com/ggml-model-whisper-%model%.bin -OutFile ggml-%model%.bin"
 
-if %ERRORLEVEL% neq 0 ( 
+if %ERRORLEVEL% neq 0 (
     echo Failed to download ggml model %model%
     echo Please try again later or download the original Whisper model files and convert them yourself.
     goto :eof
diff --git a/models/download-ggml-model.sh b/models/download-ggml-model.sh
index 10a15c95..3eef87ec 100755
--- a/models/download-ggml-model.sh
+++ b/models/download-ggml-model.sh
@@ -3,6 +3,12 @@
 # This script downloads Whisper model files that have already been converted to ggml format.
 # This way you don't have to convert them yourself.
 
+#src="https://ggml.ggerganov.com"
+#pfx="ggml-model-whisper"
+
+src="https://huggingface.co/datasets/ggerganov/whisper.cpp"
+pfx="resolve/main/ggml"
+
 # get the path of this script
 function get_script_path() {
     if [ -x "$(command -v realpath)" ]; then
@@ -46,7 +52,7 @@ fi
 
 # download ggml model
 
-printf "Downloading ggml model $model ...\n"
+printf "Downloading ggml model $model from '$src' ...\n"
 
 cd $models_path
 
@@ -56,9 +62,9 @@ if [ -f "ggml-$model.bin" ]; then
 fi
 
 if [ -x "$(command -v wget)" ]; then
-    wget --quiet --show-progress -O ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    wget --quiet --show-progress -O ggml-$model.bin $src/$pfx-$model.bin
 elif [ -x "$(command -v curl)" ]; then
-    curl --output ggml-$model.bin https://ggml.ggerganov.com/ggml-model-whisper-$model.bin
+    curl --output ggml-$model.bin $src/$pfx-$model.bin
 else
     printf "Either wget or curl is required to download models.\n"
     exit 1