* whisper : improve whisper-cli executable path detection in model download shell scripts
If whisper-cli is found on the path, do not suggest invoking from build directory. This improves flexibility and usability for distribution and packaging scenarios.
* whisper : enhance Windows model download batch script to have comparable functionality and behaviour as shell scripts
* Download models to the current directory if the script is executed from the \bin\ directory (for future distribution scenarios where the script is in the \bin\ subdirectory of a Windows build)
* Add model_path command line argument
* If whisper-cli is found on the path, do not suggest invoking from build directory
* whisper : resolve compiler warning by removing duplicate definition of NOMINMAX in whisper-cli code
This change ensures that when the script is packaged and distributed, models are downloaded to the current directory instead of the script's location, preventing conflicts with system directories. This improves flexibility and usability for distribution and packaging scenarios.
This commit updates the recommended version of Python to 3.11 for Core
ML conversion support. It also adds the `-e` flag to the
`generate-coreml-model.sh` script to ensure that the script exits on the
first error.
The motivation for this that when following the installation instructions
using Python 3.10 I get the following error:
```console
(venv) $ ./models/generate-coreml-model.sh base.en
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.1.3 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last): File "/whisper-work/models/convert-whisper-to-coreml.py", line 2, in <module>
import torch
File "/whisper-work/venv/lib/python3.10/site-packages/torch/__init__.py", line 870, in <module>
from . import _masked
File "/whisper-work/venv/lib/python3.10/site-packages/torch/_masked/__init__.py", line 420, in <module>
def sum(input: Tensor,
File "/whisper-work/venv/lib/python3.10/site-packages/torch/_masked/__init__.py", line 223, in _apply_docstring_templates
example_input = torch.tensor([[-3, -2, -1], [0, 1, 2]])
/whisper-work/venv/lib/python3.10/site-packages/torch/_masked/__init__.py:223: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /Users/distiller/project/pytorch/torch/csrc/utils/tensor_numpy.cpp:68.)
example_input = torch.tensor([[-3, -2, -1], [0, 1, 2]])
Minimum required torch version for importing coremltools.optimize.torch is 2.1.0. Got torch version 1.11.0.
Traceback (most recent call last):
File "/whisper-work/models/convert-whisper-to-coreml.py", line 4, in <module>
import coremltools as ct
File "/whisper-work/venv/lib/python3.10/site-packages/coremltools/__init__.py", line 120, in <module>
from . import converters, models, optimize, proto
File "/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/__init__.py", line 7, in <module>
from . import libsvm, sklearn, xgboost
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/xgboost/__init__.py", line 6, in <module>
from ._tree import convert
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/xgboost/_tree.py", line 9, in <module>
from ._tree_ensemble import convert_tree_ensemble as _convert_tree_ensemble
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/xgboost/_tree_ensemble.py", line 11, in <module>
from ...models.tree_ensemble import TreeEnsembleClassifier
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/models/__init__.py", line 6, in <module>
from . import (
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/models/ml_program/__init__.py", line 6, in <module>
from . import compression_utils
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/models/ml_program/compression_utils.py", line 8, in <module>
from coremltools.converters.mil.mil import Operation as _Operation
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/__init__.py", line 7, in <module>
from .frontend.tensorflow.tf_op_registry import register_tf_op
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/frontend/__init__.py", line 6, in <module>
from . import tensorflow, tensorflow2, torch
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/__init__.py", line 11, in <module>
from . import ops, quantization_ops
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/ops.py", line 36, in <module>
from .internal_graph import InternalTorchIRGraph, InternalTorchIRNode
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/internal_graph.py", line 15, in <module>
from .exir_utils import extract_io_from_exir_program
File "/Users/danbev/work/ai/whisper-work/venv/lib/python3.10/site-packages/coremltools/converters/mil/frontend/torch/exir_utils.py", line 99, in <module>
) -> Dict[str, torch.fx.Node]:
AttributeError: module 'torch' has no attribute 'fx'
```
Using Python3.11 the conversion script runs without any errors.
* Updated models download URL
* Updated list of models available
All of the high efficiency quantized models are rejected when trying to download. They exist on the server. Let's allow them.
* added path prefix for whisper-cli in message to user. The message is misleading if this script is called from another script in a different folder. So the message has to be fixed.
* undid download URL change I made earlier. Fixed filepath.Join(urlPath, model) bug.
* Undid download URL change I made earlier.
Seems that the old URL works but only when provided a model to download. Still doesn't explain why there's a different download URL that also works. Please elucidate in docs.
* Fixed URLForModel Function's bug
filepath.Join is designed for filesystem paths, and it uses backslashes (\) on Windows. URLs, however, require forward slashes (/), so the use of filepath.Join is inappropriate for constructing URLs.
The fmt.Sprintf function ensures that forward slashes are used.
* Fixed URL trailing / double slash bug
Ensure no double slash by trimming trailing '/' from srcUrl if present
* Fixed bad download URL, missing ggml prefix
Not sure if that was a bug I introduced but it was trying to download without the prefix.
* Added question before downloading all models. Added download size estimate
HEAD Requests:
Efficiently fetches file sizes without downloading the content.
Interactive Workflow:
Allows the user to make informed decisions about downloading all models.
Safe Defaults:
Aborts if the user does not explicitly confirm.
* Fixed Unbuffered channel warning.
warning in context.go : misuse of unbuffered os.Signal channel as argument to signal.
The warning indicates that the unbuffered channel used in signal.Notify in context.go may be misused. In Go, unbuffered channels can cause potential deadlocks if signals are sent faster than they are received.
* Fixed download size calculation, download URL prefix bug, added link to models URL for user.
The URL formatter was prepending the model name to the formatted model name in the URL
* Added logs and exes to gitignore
* Delete bindings/go/examples/go-model-download/go-model-download.exe
* Delete whisper_build.log
Fix issue: Conversion from Whisper to OpenVino failed #1870
convert-whisper-to-openvino.py stopped working with OpenVINO version 2023.0.0-10926-b4452d56304-releases/2023/0 .
Error was: TypeError: load(): incompatible function arguments. The following argument types are supported:
1. (self: openvino._pyopenvino.FrontEnd, path: object) -> ov::frontend::InputModel
Tested successfully with a large-v3 conversion.
Co-authored-by: Stefan Grundmann <grundmanns@sandiego.gov>
* build: add dockerfile for ci
* ci: add action to build/push docker image
* fix: lowercase repository to fix ci
* ci: update cuBLAS flag
* build: install curl and ffmped in image
* docs: add docker section
* fix: improve args check when download model
* add HuggingFace mirror to download ggml model
* support tdrz via simple hack overriding solm tokens
* fix incorrect translate/transcribe token_ids that are not static const
* add apollo 13 sample for tdrz demo
* render [SPEAKER TURN] consistently in all terminal output using vocab.id_to_token
* extend whisper_segment with speaker_turn_next field and save in json output
* fix failing go build
* slipped in some python syntax whoops
* whisper : finalize tinydiarize support (add flag + fixes)
* whisper : tdrz support for word-level timestamps (respect max_len)
* java : try to fix tests after adding tdrz_enable flag
* main : remove TODO leftover
* java : fix params order list after adding "tdrz_enable"
* whisper : fix solm and add nosp token
* main : print tinydiarize help
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit modifies the `get_script_path` function to correctly handle
spaces and special characters in directory paths. The fix involves adding
double quotes around variables and commands where needed to ensure proper
parsing of paths with spaces and special characters.