Commit Graph

  • e27c91f6d6 rpc : add rpc_msg_set_tensor_hash_req (llama/13353) Radoslav Gerganov 2025-05-09 10:31:07 +03:00
  • e46df4850f vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326) Jeff Bolz 2025-05-09 02:23:41 -05:00
  • e8a7f1b7bb sycl: addressing non-contiguous src1 mul_mats (nc and batched) (llama/13343) Alberto Cabrera Pérez 2025-05-08 10:08:01 +01:00
  • fbad8058c4 examples : add VAD speech segments example (#3147) Daniel Bevenius 2025-05-13 12:31:00 +02:00
  • bff8dc248a talk-llama : sync llama.cpp [sync-ggml-25-05-13] Georgi Gerganov 2025-05-13 13:20:19 +03:00
  • 69753804ed whisper : update to ggml-backend changes (#0) Georgi Gerganov 2025-05-13 13:11:24 +03:00
  • 89970b9aaa sync : ggml Georgi Gerganov 2025-05-13 13:10:17 +03:00
  • 79fb43e252 ggml : add mrope kernel for metal (llama/13457) Xuan-Son Nguyen 2025-05-13 13:10:08 +03:00
  • 926e06dbfd metal : optimize MoE for large batches (llama/13388) Georgi Gerganov 2025-05-13 13:09:20 +03:00
  • 43a59eccf6 opencl: remove unnecessary assert for add (llama/13257) lhez 2025-05-12 13:13:49 -07:00
  • fe0d52b9a2 llama/ggml: add LLM training support (llama/10544) Johannes Gäßler 2025-05-12 14:44:49 +02:00
  • cb90cb0992 ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (llama/13053) Dan Johansson 2025-05-12 13:06:19 +02:00
  • 8264872b5d CUDA: fix misaligned synchronization in FA (llama/13469) Johannes Gäßler 2025-05-12 10:51:21 +02:00
  • 882d975729 enable dpcpp nightly builds with libraries (llama/13406) Atharva Dubey 2025-05-12 06:15:32 +01:00
  • c426829771 CUDA: fix crash with partial offloading of MoE (llama/13439) Johannes Gäßler 2025-05-11 16:09:33 +02:00
  • 0b1962a181 Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (llama/13386) David Huang 2025-05-11 20:18:39 +08:00
  • 86dece9c7c CUDA: fix race conditions FlashAttention kernels (llama/13438) Johannes Gäßler 2025-05-10 22:22:48 +02:00
  • 04445664b4 CUDA: fix FlashAttention on Turing (llama/13415) Johannes Gäßler 2025-05-10 09:16:52 +02:00
  • 22f4997dd8 vulkan: scalar flash attention implementation (llama/13324) Jeff Bolz 2025-05-09 23:07:07 -07:00
  • b493e03b90 sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858) Alberto Cabrera Pérez 2025-05-09 16:34:08 +01:00
  • aef59f4851 CUDA: FA support for Deepseek (Ampere or newer) (llama/13306) Johannes Gäßler 2025-05-09 13:34:58 +02:00
  • f8c75dc43e CUDA: fix crash on large batch size for MoE models (llama/13384) Johannes Gäßler 2025-05-09 12:14:04 +02:00
  • 00c8056715 rpc : add rpc_msg_set_tensor_hash_req (llama/13353) Radoslav Gerganov 2025-05-09 10:31:07 +03:00
  • 19d8d9a928 vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326) Jeff Bolz 2025-05-09 02:23:41 -05:00
  • 0c4a229154 sycl: addressing non-contiguous src1 mul_mats (nc and batched) (llama/13343) Alberto Cabrera Pérez 2025-05-08 10:08:01 +01:00
  • b2513a6208 vad : remove shortform for --vad option in cli.cpp (#3145) Daniel Bevenius 2025-05-13 06:04:05 +02:00
  • 587ea01f55 docs : update README.md for whisper.objc app (#2569) Tomer Schlesinger 2025-05-13 06:03:50 +02:00
  • e41bc5c61a vad : add initial Voice Activity Detection (VAD) support (#3065) Daniel Bevenius 2025-05-12 16:10:11 +02:00
  • e39ba750cd whisper : remove dummy commit comment [no ci] (#3143) Daniel Bevenius 2025-05-12 14:40:17 +02:00
  • db0fc9edc6 docs : fix -owts flag typo karaoke section [no ci] (#3142) Daniel Bevenius 2025-05-12 10:56:39 +02:00
  • 186855e38b cli : print color scheme info for --print-colors (#3141) Daniel Bevenius 2025-05-12 10:43:04 +02:00
  • a513146102 docs : update Readme to recommend same Openvino as Python tools (#3138) Simon Booth 2025-05-12 08:06:51 +01:00
  • 4730950492 examples : update link to Paul Tol's color scheme [no ci] (#3140) Daniel Bevenius 2025-05-12 09:02:06 +02:00
  • 9dd9685c79 ruby : test extra build options only when env var specified (#3136) KITAITI Makoto 2025-05-12 13:49:46 +09:00
  • 2e310b841e ruby : omit test_build_options locally (#3132) Daniel Bevenius 2025-05-10 08:18:08 +02:00
  • 5d4390d281 examples : add HEAPU8 to all of the exported runtime methods (#3134) Enes Grahovac 2025-05-10 00:44:13 -04:00
  • 9791647653 wasm : add note about worker.js file generation [no ci] (#3133) Daniel Bevenius 2025-05-09 15:42:45 +02:00
  • 288304ee64 whisper : deprecate WHISPER_CCACHE CMake option (#3131) Daniel Bevenius 2025-05-09 14:13:41 +02:00
  • b6f3fa4059 stream.wasm : add HEAPU8 to exported runtime methods (#3130) Daniel Bevenius 2025-05-08 16:58:34 +02:00
  • cb2bd11ee8 sync : ggml Georgi Gerganov 2025-05-07 17:45:14 +03:00
  • 09e6b66025 cuda : remove nrows_x in mul_mat_q_process_tile (llama/13325) R0CKSTAR 2025-05-07 15:48:23 +08:00
  • d41cf26a0f CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (llama/13135) Johannes Gäßler 2025-05-06 23:35:51 +02:00
  • 3c67195be9 SYCL: Disable reorder optimize by default and stop setting tensor extras when optimize is disabled (llama/13254) Akarshan Biswas 2025-05-06 20:27:06 +05:30
  • f9f78a773f CUDA: fix bad asserts for partial offload (llama/13337) Johannes Gäßler 2025-05-06 13:58:51 +02:00
  • be55e25cac CUDA: fix --split-mode row for MMQ (llama/13323) Johannes Gäßler 2025-05-06 08:36:46 +02:00
  • 2ffdda99e8 CUDA: fix logic for clearing padding with -ngl 0 (llama/13320) Johannes Gäßler 2025-05-05 22:32:13 +02:00
  • 9bbedc51cc SYCL: Disable mul_mat kernels for noncontiguous tensor b (llama/13308) Akarshan Biswas 2025-05-05 13:39:10 +05:30
  • 1e1fa27add rpc : use backend registry, support dl backends (llama/13304) Diego Devesa 2025-05-04 21:25:43 +02:00
  • e1bdd148c5 ggml : activate s390x simd for Q3_K (llama/13301) Aaron Teo 2025-05-05 01:49:12 +08:00
  • 7fa8bb303f CUDA: fix race condition in MMQ stream-k fixup (llama/13299) Johannes Gäßler 2025-05-04 14:16:39 +02:00
  • 7564f5e6f1 CUDA: fix race condition in MMQ ids_dst (llama/13294) Johannes Gäßler 2025-05-04 13:58:38 +02:00
  • 22ba2e27ce vulkan: Additional type support for unary, binary, and copy (llama/13266) Jeff Bolz 2025-05-04 00:17:16 -05:00
  • 0676b2dab2 ci : add bindings-java jar artifact to release (#3126) Daniel Bevenius 2025-05-07 16:26:54 +02:00
  • 4a512cb153 cli : avoid std::exchange Georgi Gerganov 2025-05-07 13:22:47 +03:00
  • 76171ce199 sync : ggml Georgi Gerganov 2025-05-07 13:17:48 +03:00
  • 5eac2a3fbb vulkan : fix lint (llama/0) Georgi Gerganov 2025-05-02 20:57:07 +03:00
  • 42938398f9 ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) shalinib-ibm 2025-05-02 22:23:12 +05:30
  • a8fe90ae15 rpc : avoid uninitialized memory in serialize_tensor (llama/13210) Justin Santa Barbara 2025-05-01 17:32:11 -04:00
  • c5a5a2da5b ggml: Don't assert fail when tensor data changes (llama/13222) Jesse Gross 2025-05-01 13:46:10 -07:00
  • 8316bfd82b build : fix build info on windows (llama/13239) Diego Devesa 2025-05-01 21:48:08 +02:00
  • fd1cb9fc12 vulkan: Add bfloat16 support (llama/12554) Jeff Bolz 2025-05-01 13:49:39 -05:00
  • 17f6b8225e vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) Jeff Bolz 2025-05-01 13:19:31 -05:00
  • 6374ea32ca vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) Acly 2025-05-02 18:02:34 +02:00
  • 3a66f9f248 ci : zip windows artifacts for release uploading (#3124) Daniel Bevenius 2025-05-07 13:12:08 +02:00
  • 0055356fbc cli : avoid std::exchange [sync-ggml-25-05-07] Georgi Gerganov 2025-05-07 13:22:47 +03:00
  • eeaa1cd035 sync : ggml Georgi Gerganov 2025-05-07 13:17:48 +03:00
  • a652c8bf72 vulkan : fix lint (llama/0) Georgi Gerganov 2025-05-02 20:57:07 +03:00
  • 0630539c8a ggml : Enable MMA for BF16 in llamafile_sgemm (llama/13148) shalinib-ibm 2025-05-02 22:23:12 +05:30
  • a7988d76db rpc : avoid uninitialized memory in serialize_tensor (llama/13210) Justin Santa Barbara 2025-05-01 17:32:11 -04:00
  • 37ac0264ef ggml: Don't assert fail when tensor data changes (llama/13222) Jesse Gross 2025-05-01 13:46:10 -07:00
  • 5a9ccde7da build : fix build info on windows (llama/13239) Diego Devesa 2025-05-01 21:48:08 +02:00
  • cde0e50536 vulkan: Add bfloat16 support (llama/12554) Jeff Bolz 2025-05-01 13:49:39 -05:00
  • df458380d6 vulkan: Handle src1 batch dimension in non-contiguous mat-vec-mul shader (llama/13191) Jeff Bolz 2025-05-01 13:19:31 -05:00
  • 87b88ed01c vulkan : kernels for depthwise 2D convolution (CONV_2D_DW) (ggml/1204) Acly 2025-05-02 18:02:34 +02:00
  • 9b584b0cc0 ci : add zip extension to xcframework artifact name (#3120) Daniel Bevenius 2025-05-07 12:02:29 +02:00
  • 09846f4e12 whisper: remove MSVC warnings pragmas (#3090) Daniel Bevenius 2025-05-05 13:09:35 +02:00
  • bcf1ed0163 server: update abort mechanism to handle HTTP connection closure (#3112) Sacha Arbonel 2025-05-05 07:16:54 +02:00
  • 934d4b3083 cli : support "-" for stdout like stdin (#3050) Daniel Tang 2025-05-05 01:15:39 -04:00
  • 988dcd4b5b docs : Update cli documentation (#3102) Arpit Jain 2025-05-02 20:18:33 +08:00
  • 9f540ad8cb cmake : removed stdc++fs (#3097) Jared Tweed 2025-05-02 02:41:35 -07:00
  • 1fa17bc752 server : update httplib.h to version 0.20.0 (#3101) Sacha Arbonel 2025-05-02 06:09:41 +02:00
  • 366082d072 ruby : refine HTTP cache feature (#3109) KITAITI Makoto 2025-05-01 23:04:53 +09:00
  • 0778b6ff5f talk-llama : sync llama.cpp Georgi Gerganov 2025-05-01 10:43:30 +03:00
  • 5cd59c9396 sync : ggml Georgi Gerganov 2025-05-01 10:42:48 +03:00
  • d052e64d42 CUDA: batched+noncont MMQ, refactor bs>1 MoE code (llama/13199) Johannes Gäßler 2025-04-30 23:12:59 +02:00
  • 780750a108 vulkan: use uint array index to avoid glslang bug (llama/13193) Jeff Bolz 2025-04-30 07:38:37 -05:00
  • 919c78e618 ggml : fix ppc64le build (llama/13176) shalinib-ibm 2025-04-30 16:47:08 +05:30
  • dc288f84cd feat(ggml-cpu): enable z17 compile (llama/13182) Aaron Teo 2025-04-30 17:47:35 +08:00
  • 1543a3600c CUDA: fix non-cont. inputs for batched mat mul (llama/13155) Johannes Gäßler 2025-04-29 16:00:27 +02:00
  • 4872355f6e fix(rpc): Improve input validation and error handling (llama/13069) Ville Vesilehto 2025-04-28 21:00:20 +03:00
  • 1a76e97c28 SYCL: Add all missing unary kernels (llama/13074) Akarshan Biswas 2025-04-28 15:03:25 +05:30
  • 7017c1d37d musa: fix typo in cc control (llama/13144) R0CKSTAR 2025-04-28 15:33:28 +08:00
  • 670bf02662 CUDA: fix q_nope_absorbed prec for DS 2 Lite f16 (llama/13137) Johannes Gäßler 2025-04-28 09:29:26 +02:00
  • 9fff2f751c musa: fix build warning (llama/13129) R0CKSTAR 2025-04-27 19:22:49 +08:00
  • 46392f733f ggml: move fp16/bf16 conversion optimizations to CPU backend + export conversion APIs (llama/13107) SXX 2025-04-26 22:05:31 +08:00
  • eeb259909e change the reorder tensor from init to execute OP (llama/13003) Neo Zhang Jianyu 2025-04-25 17:37:51 +08:00
  • fe21ddf0dc rpc : do not wait for response when sending RPC_CMD_SET_TENSOR (llama/12943) Radoslav Gerganov 2025-04-25 10:08:08 +03:00
  • 33bdbfbb33 ggml : fix ggml_gallocr_ptr type (ggml/1205) Diego Devesa 2025-04-30 15:20:40 +02:00
  • 0f49edf0f3 whisper : add check that target name exists (#3103) Daniel Bevenius 2025-05-01 10:05:24 +02:00
  • 25efcfe3ed server : add --no-gpu option to print usage output (#3098) Daniel Bevenius 2025-05-01 08:15:12 +02:00