whisper.cpp

mirror of https://github.com/ggerganov/whisper.cpp.git synced 2025-08-18 05:20:18 +02:00

Author	SHA1	Message	Date
Justine Tunney	7f78675008	examples : use colorblind friendly TTY color scheme (#2360 ) This change updates the -pc flag, so that a new xterm256 color scheme is used. This color scheme is believed to be better for three reasons: 1. It should be friendlier to the colorblind. The scheme was designed by Paul Tol (see: https://personal.sron.nl/~pault/). TensorBoard has used it since 2017, so it's already popular in the machine learning community. 2. It should appear to be the same colors as before to people who aren't colorblind, i.e. it's still a red-green spectrum like before but lightly modified. 3. It is readable in both white and black background terminals. The neon colors before were probably a bit too intense for white backgrounds.	2024-08-20 10:49:10 +03:00
Georgi Gerganov	58323bf8ed	build : fix aarch64 (#0 )	2024-08-08 22:48:46 +03:00
Georgi Gerganov	22058f2dbc	talk-llama : sync llama.cpp	2024-08-08 22:48:46 +03:00
Georgi Gerganov	c7ea4fd235	common : handle new quant types (ggml/0)	2024-08-08 22:48:46 +03:00
Georgi Gerganov	dbf9c15e30	talk-llama : sync llama.cpp	2024-07-08 14:53:55 +03:00
Georgi Gerganov	d3f6c34976	examples : fix compile warnings [no ci] (#0 )	2024-07-08 14:53:55 +03:00
Emmanuel Schmidbauer	bec9836849	server : add inference path to make OAI API compatible (#2270 )	2024-07-08 14:24:58 +03:00
Georgi Gerganov	4a62efbb95	cmake : minor fixes	2024-06-26 21:42:39 +03:00
Georgi Gerganov	dc8cc2dd6f	whisper : disable CUDA mel + fix FFMPEG	2024-06-26 20:11:38 +03:00
Georgi Gerganov	e30c679928	whisper : reorganize source code + improve CMake (#2256 ) * scripts : update sync [no ci] * files : reorganize [no ci] * sync : llama.cpp * cmake : link math library * cmake : build normal ggml library * files : move headers to include * objc : fix path to ggml-metal.h * ci : fix WHISPER_CUDA -> GGML_CUDA * scripts : sync LICENSE [no ci]	2024-06-26 19:34:09 +03:00
Georgi Gerganov	e293f17d34	talk-llama : sync llama.cpp	2024-06-18 09:45:37 +03:00
slaren	de29b193f6	move BLAS to a separate backend (cont) (llama/6210) ggml-ci	2024-06-18 09:39:40 +03:00
Georgi Gerganov	3b1ac03828	ggml : remove OpenCL (#0 )	2024-06-16 18:19:48 +03:00
Georgi Gerganov	061eeb9f61	talk-llama : sync llama.cpp	2024-06-16 18:19:48 +03:00
Borislav Stanimirov	af5833e298	whisper : remove `speed_up` and `phase_vocoder` functions (#2198 ) whisper : fix cast warning * whisper : remove phase_vocoder functions, ref #2195 * whisper : remove speed_up from whisper_full_params, closes #2195	2024-05-31 11:37:29 +03:00
Daniel Valdivia	a7dc2aab16	server : fix typo (#2181 ) A simple comment typo, PR can be dismissed	2024-05-25 10:46:22 +03:00
William Tambellini	1b51fdf170	examples : add support for decoding input with ffmpeg (Linux) (#2133 ) - search for ffmpeg libs/headers at cmake time - added ffmpeg-transcode.cpp into libcommon if ffmpeg on - hooked ffmpeg transcoding in common read_wav(...) - passed test: ./main -m ggml-base.en.bin -f samples/jfk.mp3	2024-05-21 18:31:41 +03:00
Pedro Probst	adee3f9c1f	node : add flash_attn param (#2170 )	2024-05-20 09:08:48 +03:00
Georgi Gerganov	7094ea5e75	whisper : use flash attention (#2152 ) * whisper : use flash attention in the encoder * whisper : add kv_pad * whisper : remove extra backend instance (huh?) * whisper : use FA for cross-attention * whisper : use FA for self-attention * whisper : simplify encoder FA * whisper : add flash_attn runtime parameter * scripts : add bench log * scripts : add M1 Pro bench log	2024-05-15 09:38:19 +03:00
petterreinholdtsen	9d5771ae43	talk-llama : reject runs without required arguments (#2153 ) * Extended talk-llama example to reject runs without required arguments. Print warning and exit if models are not specified on the command line. * Update examples/talk-llama/talk-llama.cpp * Update examples/talk-llama/talk-llama.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-05-14 21:32:41 +03:00
Georgi Gerganov	4ef8d9f44e	server : return utf-8 (#2138 )	2024-05-13 15:33:46 +03:00
Pedro Probst	3928dbd206	node : add audio_ctx and audio buffer params (#2123 ) * node : add audio_ctx param * node : support passing audio buffer directly * node : parse audio_ctx in index.js --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-05-13 15:22:23 +03:00
valVk	30f73109b8	node : add additional params (#2000 ) * Add additional params to addon.node * Add comma_in_time as parameter * Fix tests	2024-05-13 15:15:43 +03:00
Mark Karpelès	17fa62d3d3	js : remove un-needed request header from fetchRemote (#2119 )	2024-05-13 15:13:19 +03:00
Daniel Ziegenberg	0bb05b113d	main : dont print timings with --no-prints (#2108 ) Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>	2024-05-13 15:00:19 +03:00
Daniel Ziegenberg	f141b2b938	main : add options for temperature control (#2088 ) Add two options: ``` -tp, --temperature N [0.00 ] The sampling temperature, between 0 and 1 -tpi, --temperature-inc N [0.20 ] The increment of temperature, between 0 and 1 ``` The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit. Signed-off-by: Daniel Ziegenberg <daniel@ziegenberg.at>	2024-05-13 14:59:44 +03:00
zhangjixiong	e93081f83f	whisper.android : update example, add field to print timestamp (#2072 )	2024-05-13 14:30:03 +03:00
Xingchen Song(宋星辰)	b6bbce4ae9	cmake : fix json INTERFACE library (#2069 )	2024-05-13 14:29:39 +03:00
mashizora	7705dc52da	main : fix double quote escaping in csv output (#2090 )	2024-05-13 11:55:32 +03:00
Georgi Gerganov	3fa7d29876	talk-llama : sync llama.cpp	2024-05-13 11:02:26 +03:00
Georgi Gerganov	accada542a	ggml : resolve merge (ggml/0) ggml-ci	2024-05-13 11:02:26 +03:00
Pedro Probst	58210d6a76	examples : fix node compilation (#2115 ) * node : fix compilation and update examples * node : fix readme * Update addon.node test	2024-05-02 22:52:55 +01:00
Georgi Gerganov	b0c3cbf2e8	main : pass nullptr when regex is empty (#2070 )	2024-04-17 12:23:47 +03:00
Emmanuel Schmidbauer	9fab28135c	server : add dtw (#2044 ) * server.cpp: add dtw * Update examples/server/server.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-04-15 22:16:58 +03:00
Pedro Probst	1b5439a6c2	node : support no timestamps (#2048 ) * fix: node: do not compute timestamps if you do not need them * feat: add no_timestamps parameter to node addon	2024-04-15 20:03:34 +03:00
Kendrick Taylor	5c554c04ff	whisper.nvim : fix missing reference to "model" variable (#2049 )	2024-04-15 19:41:28 +03:00
Ikko Eltociear Ashimine	c383f091a1	whisper : update grammar-parser.cpp (#2058 ) preceeding -> preceding	2024-04-15 19:40:27 +03:00
ulatekh	c15b4cda7d	common : fix file-handle leak in read_wav() (#2026 ) Now it cleans up in case of error.	2024-04-09 18:34:34 +03:00
Rotem Dan	d3cfb6ca2b	main : set stdin to binary mode on Windows (#2025 )	2024-04-09 18:33:32 +03:00
ulatekh	671b4bde6c	main : allow a response-file as the sole parameter (#2019 ) * The "main" example now allows a response-file as the sole parameter. A response-file is a text file with command-line parameters, one per line. Prefix the name of the response-file with "@" to identify it as such. It's used under MS Windows to work around command-line length limits. It may be useful under other platforms to simplify character-escaping. * minor : style --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-04-09 18:31:16 +03:00
ulatekh	c8eeb93a6a	whisper : suppress tokens with a regex (#1997 ) * Allow a regular expression to describe tokens to suppress. Example: --suppress-tokens-re "[,\.]\|[ ]?[0-9]+" will suppress commas, periods, and numeric tokens. Technique inspired by https://github.com/openai/whisper/discussions/1041 Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * Blind change to fix Java test. --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-04-09 18:27:28 +03:00
ulatekh	319fe5146e	cmake : create solution folders (#2004 ) * Create solution folders in the CMake build. * Fixed non-SDL2 build. * Fixed emscripten build.	2024-04-09 18:23:33 +03:00
Georgi Gerganov	81a3c41aa0	talk-llama : sync llama.cpp	2024-04-07 16:21:08 +03:00
ulatekh	fc366b807a	main : add command-style grammar (#1998 ) * Implemented command-style grammar in the main example. Mostly just copied the relevant parts from the command example. * main : code style --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-03-28 12:02:10 +02:00
Georgi Gerganov	9fb308d90f	make : add grammar parser to common objects	2024-03-28 11:59:48 +02:00
Georgi Gerganov	2948c740a2	sync : ggml (#2001 ) * sync : update scripts * sync : ggml * talk-llama : sync llama.cpp * make : WHISPER_CUBLAS -> WHISPER_CUDA * ci : try to fix sycl build * talk-llama : fix make build	2024-03-27 18:55:10 +02:00
Georgi Gerganov	1558ec5a16	whisper : improve handling of prompts (#1981 ) * whisper : improve handling of prompts * whisper : add whisper_token_count helper	2024-03-25 14:48:19 +02:00
Mohammadreza Hendiani	04e48094e4	readme : add Fedora dependencies (#1970 ) * README.md fix documentation and added Fedora Linux dependencies for stream build * fix documentation and added Fedora Linux dependencies for command build * fix documentation and added Fedora Linux dependencies for talk build * fix documentation and added Fedora Linux dependencies for talk-llama build * reverted back mistakenly removed MacOS documentation	2024-03-20 18:42:11 +02:00
denersc	741abb162c	whisper : token-level timestamps with DTW (#1485 ) * whisper.cpp: impl dtw algo * WIP: producing and placing DTW timestamps on tokens * Fix compile and assertion errors. Attempt to DTW timestamp with single_segment=false. * Fix mistake causing incorrect alignment of dtw timestamps * implement N_TOP_MOST and CUSTOM alignment heads setting * whisper: fix typo on alignment heads enum * Fix issues related to changes in whisper.cpp * Fixed excessive memory use when using DTW timestamps. Other minor fixes to DTW timestamping function * decoder: save cross QKs only if requested * Calling median filter with ggml_map_custom1 * Reimpl aheads n_top_most and custom. Sanity checks on chosen aheads * Copying cross QKs from decoder backend correctly * dtw: cleanup * Fix incorrect n_frames passed to dtw when near end of audio * Fix aheads_masks_init for backend != CPU * whisper : minor style * main : add dtw (wip) * whisper: fix invalid memory access in aheads_masks_init * main : add dtw (cont) * whisper : minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-03-20 18:25:26 +02:00
Jo Liss	e7794a868f	examples : rename --audio-context to --audio-ctx per help text (#1953 )	2024-03-18 17:53:33 +02:00


395 Commits