Commit Graph

812 Commits

Author SHA1 Message Date
Georgi Gerganov
c8b3bc6a0d
cuda : use CUBLAS_COMPTE_F32 insted of CUBLAS_COMPUTE_F16 2023-11-27 11:57:07 +02:00
bobqianic
f52e74d4dc
CI : Rectify the Clang-Related workflow issues (#1551)
* fix bugs in workflow

* fix missing clang in workflow

* Update build.yml
2023-11-27 11:35:37 +02:00
Ismatulla Mansurov
23c21e92eb
server : automatically convert audio on the server (#1539)
* server : automatically convert audio on the server

* server : remove rebundant comments

* server : automatic conversion refactor

* server : update server readme

* server : remove unnecessary comments and tabs

* server : put back remove calling

* server : apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : check ffmpeg before the server lunch

* server : fix indentation

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server : fix function typo calling

* server : fix function typo calling

* server : add warning in readme

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-27 11:28:34 +02:00
Georgi Gerganov
447d49530c
whisper : remove trailing whitespaces 2023-11-24 13:13:21 +02:00
Georgi Gerganov
9d6ebd877c
release : v1.5.1 2023-11-24 12:41:55 +02:00
Georgi Gerganov
0ba365f958
metal : add backend function to check device family support (#1547) 2023-11-24 12:37:08 +02:00
Georgi Gerganov
010c8ec3ab
cuda : sync some minor stuff from llama.cpp (#1548) 2023-11-24 12:36:21 +02:00
Georgi Gerganov
ffdb5c4735
whisper : fix typo 2023-11-24 09:45:10 +02:00
ecneladis
a5881d619c
server : add --print-realtime param (#1541)
* server : add --print-realtime param

* Fix duplicate realtime output
2023-11-24 09:35:02 +02:00
bradmit
34f70b3a56
whisper : add whisper_lang_str_full (#1546)
* Update whisper.h

add whisper_lang_fullstr to retrieve the full language name

* Update whisper.cpp

add whisper_lang_fullstr to return the full language name

* fullstr -> str_full

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-24 09:33:13 +02:00
Okabintaro
8328d1900f
fix(server): typo in temperature parameter (#1545)
Also fixed another typo in comments.
2023-11-23 20:59:36 +02:00
sandrohanea
d2bd5f0bdc
metal : fix build (#1544) 2023-11-23 20:20:53 +02:00
Georgi Gerganov
34209a37a2
readme : add server example 2023-11-23 17:20:33 +02:00
Gleicon Moraes
180e062eda
go : fixed Makefile for MacOS ARM 64 (#1530)
* Fixed Makefile for MacOS ARM 64 based on https://github.com/ggerganov/whisper.cpp/issues/1344 + proper ggml-metal env var setting

* conditional to fix broken non-macos compilation

* spaces -> tab

* make : fix whitespaces

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-22 18:08:11 +02:00
Felix
5c7be85fdc
Change temp file name for server application (#1535)
Avoid issue of removing file if it exists in the current working
directory
2023-11-22 09:23:36 +01:00
Georgi Gerganov
146169ec38
bench : pass memcpy threads from cli 2023-11-21 22:27:22 +02:00
Georgi Gerganov
9befab5ab9
bench : multi-thread memcpy (#1534) 2023-11-21 22:07:30 +02:00
Felix
9ac88f2b57
Close file after writing in server application (#1533)
Fix of mistake leaving file open while reading it again as wav
2023-11-21 20:36:10 +01:00
Georgi Gerganov
46f5b6cb08
server : add video to readme 2023-11-21 17:30:43 +02:00
Felix
eff3570f78
server : add a REST Whisper server example with OAI-like API (#1380)
* Add first draft of server

* Added json support and base funcs for server.cpp

* Add more user input via api-request

also some clean up

* Add reqest params and load post function

Also some general clean up

* Remove unused function

* Add readme

* Add exception handlers

* Update examples/server/server.cpp

* make : add server target

* Add magic curl syntax

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-20 21:40:24 +02:00
M. A. Ali
fa19bc4195
whisper : update example in whisper.h (#1529)
update the example in the header, previous examples deprecated.
2023-11-20 20:52:27 +02:00
Georgi Gerganov
a01b2e0971
sdl : fix audio callback (#1523) 2023-11-20 13:16:38 +02:00
Georgi Gerganov
8159a9ab99
whisper : reuse whisper_decode_with_state (#1521) 2023-11-20 13:16:11 +02:00
Tamotsu Takahashi
7516d9c16d
ci : redistribute CUDA DLLs (#1522)
see https://docs.nvidia.com/cuda/eula/index.html#attachment-a
2023-11-19 12:43:22 +02:00
sandrohanea
46cc26d1b9
whisper : fix with_state methods to use the correct state (#1519)
Co-authored-by: Sandro Hanea <sandrohanea@microsoft.com>
2023-11-19 11:25:30 +02:00
Georgi Gerganov
f784f9fa12
whisper : fix overriding the audio context 2023-11-19 10:32:32 +02:00
Georgi Gerganov
ca23f8ee6d
cuda : assert ggml_add sources to be contiguous 2023-11-19 10:32:08 +02:00
Georgi Gerganov
e2f0eba2d4
ios : sync submodule 2023-11-17 10:42:04 +02:00
Georgi Gerganov
d4353e48f7
sync : ggml (ggml-alloc + linker + gguf fixes) (#1501) 2023-11-17 10:00:07 +02:00
Georgi Gerganov
bebf0da983
quantize : add support for K-quant types 2023-11-16 16:18:24 +02:00
Georgi Gerganov
848e54f3ad
bench : fix memcpy bench size 2023-11-16 10:59:32 +02:00
Sam Pullara
7883d1cae4
talk-llama : improve quote and backtick handling (#1364)
* ISSUE-1329: replace " with ' so it doesn't try to execute code in backticks.

* Typo

* Update to keep possessives in the output

Closes the ' then puts a ' in quotes then reopens the ' to escape the ' characters.
2023-11-16 10:34:05 +02:00
Georgi Gerganov
ccc85b4ff8
talk-llama : enable GPU by default 2023-11-15 21:33:00 +02:00
Georgi Gerganov
c7606b47df
models : add info about distilled models 2023-11-15 21:10:13 +02:00
Georgi Gerganov
d38af151a1
release : v1.5.0 2023-11-15 21:02:52 +02:00
Georgi Gerganov
94267df08e
bench-all : add distil models 2023-11-15 20:49:12 +02:00
Georgi Gerganov
8713c67133
js : latest whisper.js 2023-11-15 20:10:16 +02:00
Georgi Gerganov
57a60639bb
bench-all : indentations 2023-11-15 20:01:15 +02:00
Georgi Gerganov
bfbaa4dce5
whisper : make large version explicit + fix data size units (#1493) 2023-11-15 19:42:25 +02:00
Georgi Gerganov
1d79e78402
java : fix test (#1492) 2023-11-15 17:42:53 +02:00
Georgi Gerganov
b6c5f49b78
whisper : add batched decoding (#1486)
* whisper : add whisper_batch

* whisper : move kv_self to whisper_state

* whisper : full batched decoding support

* whisper : fix memory leak in whisper_batch

* whisper : fix mem leak again + remove oboslete function

* whisper : clear kv cache when using whisper_decode API

* whisper : speed-up sampling

* whisper : fix decoders initializer

* bench : add batch size 5 bench

* whisper : add comment about the KV cache size

* whisper : add check for max number of decoders

* whisper : avoid starting sampling threads with bs=1

* whisper : enable beam-search by default

* cuda : sync llama.cpp fixes
2023-11-15 16:12:52 +02:00
Georgi Gerganov
d4231649e6
java : use tiny.en for tests (#1484)
* java : use tiny.en for tests

* java : try to fix full params struct
2023-11-13 16:53:55 +02:00
Evan Jones
3e5c7feeff
whisper : add grammar-based sampling (#1229)
* whisper : add grammar-based sampling

* build : fix after master merge

* command : fix exception when recognizing the command

* whisper : fine-tuning grammar functionality

* command : grammar-related improvements

- option to read grammar from file
- add sample grammars for colors and chess moves
- fine-tune the performance further

* grammars : add assistant + update comments

* command : enable beam-search, add "no_timestamps", add "context", add p

* whisper : remove comment

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-11-13 10:51:34 +02:00
rlapray
c23598e4ca
talk-llama : add n_gpu_layers parameter (#1475) 2023-11-13 10:04:16 +02:00
Tong Li
54a08bde29
examples : add whisper.android.java for compatibility with older Android versions using Java (#1382)
* save the recorded audio to a file

* Alignment -help

* Save the correct audio

* chage to a consistent coding style

* Correct typo

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Correct variable misuse

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* Update examples/stream/stream.cpp

* add *.bin .cxx/ .gradle/ cmake-build-debug/ to gitignore

* add whisper.android.java

* Added support for older versions of Android of Java

* add examples for android java

* add README.md for android java

* add fullTranscribeWithTime

* 增加 toString()方法和测试

* change return type to void

* update to v1.4.1

* add WhisperService

* chage to whisper_full_get_segment_t1

* add method transcribeDataWithTime

* modified toString
```
return "[" + start + " --> " + end + "]:" + sentence;
```

* Optimize code logic

* update text view on handle

* set max lines

* change Chinese to English

* Update bindings/java/build.gradle

* Update .gitignore

* add android.java to github action

* chage android.java to   android_java in build.yml

* remove gradle

* chage jdk to temurin in android_java of CI

* chage jdk to temurin 11 in android_java of CI

* add x to gradlew

* set api-level for android_java of CI

* Update examples/whisper.android.java/app/src/main/jni/whisper/CMakeLists.txt

* add ndk version in build.gradle

* remove local.properties

* add testFullTranscribeWithTime

---------

Co-authored-by: litongmacos <litongjava@qq.com>
Co-authored-by: bobqianic <129547291+bobqianic@users.noreply.github.com>
2023-11-12 18:31:58 +02:00
Georgi Gerganov
9f8bbd3fee
readme : update comment about source code 2023-11-12 17:47:37 +02:00
Georgi Gerganov
3172006a24
ggml : fix some compile warnings 2023-11-12 16:36:20 +02:00
Georgi Gerganov
684bc8bd70
readme : update GPU / CUDA 2023-11-12 15:40:37 +02:00
Georgi Gerganov
b0502836b8
whisper : add full CUDA and Metal offloading (#1472)
* whisper : migrate to ggml-backend

* whisper : fix logit reading

* whisper : fix tensor allocation during load

* whisper : fix beam-search with CUDA

* whisper : free backends + fix compile warning

* whisper : print when CUDA is enabled

* whisper : fix CoreML

* make : clean-up

* talk : fix compile warning

* whisper : support ggml_conv with CUDA and Metal (#1473)

* ggml : add CUDA support for ggml_conv

* whisper : remove ggml_repeat for conv bias + single backend

* cuda : fix im2col kernel

* metal : add im2col support + mul mat-vec f16 x f16

* bench-all : add q4 models

* whisper : clean-up

* quantize-all : fix

* ggml : im2col opts

* whisper : avoid whisper_model_data wrapper

* whisper : add note that ggml_mul_mat_pad does not work with CUDA

* whisper : factor out graph compute in common function

* whisper : fixes

* whisper : fix UB with measure buffers

* whisper : try to fix the parallel whisper_state functionality (#1479)

* whisper : try to fix the parallel whisper_state functionality

* whisper : fix multi-state Metal

* whisper : free backend instances in whisper_state
2023-11-12 15:31:08 +02:00
Ben Nortier
ec7a6f04f9
whisper : return with error from whisper_encode_internal and whisper_decode_internal when abort callback is true (#1456)
Co-authored-by: Ben Nortier <ben@bjnortier.com>
2023-11-10 13:51:16 +02:00