Jhen-Jie Hong
|
a4bb2df36a
|
quantize : fix load vocab crash when len is 128 (#1160)
* quantize : fix load vocab crash when len is 128
* ci : add quantize job
|
2023-08-06 11:04:42 +03:00 |
|
Georgi Gerganov
|
d6509bf78d
|
ggml : sync latest repo (mostly refactoring changes)
|
2023-07-02 21:46:09 +03:00 |
|
Georgi Gerganov
|
5feb0dffba
|
ggml : sync latest ggml lib
|
2023-06-25 14:30:44 +03:00 |
|
Georgi Gerganov
|
e693074aa6
|
ggml : sync latest ggml
- New Q4 and Q5 formats
- Various improvements
|
2023-05-14 18:04:23 +03:00 |
|
Georgi Gerganov
|
794b162a46
|
whisper : add integer quantization support (#540)
* whisper : add integer quantization support
* examples : add common-ggml + prepare to add "quantize" tool
* whisper : quantization tool ready
* whisper : fix F32 support
* whisper : try to fix shared lib linkage
* wasm : update quantized models to Q5
* bench.wasm : remove "medium" button
* bench.wasm : fix custom model button
* ggml : add Q5_0 and Q5_1 WASM SIMD
* wasm : add quantized models to all WASM examples
* wasm : bump DB version number to 2
* talk-llama : update example to latest llama.cpp
* node : increase test timeout to 10s
* readme : add information for model quantization
* wasm : add links to other examples
|
2023-04-30 18:51:57 +03:00 |
|