Commit Graph

  • 1f5fdbecb4 ruby : add VAD support, migration to Ruby's newer API (#3197) master KITAITI Makoto 2025-05-28 20:05:12 +09:00
  • 5720426d97 whisper : install shared libs when using GGML_BACKEND_DL (#3195) Simon Booth 2025-05-28 09:15:04 +01:00
  • b9d27b1358 tests : add a new benchmark test for long-form audio (#3185) Fujimoto Seiji 2025-05-28 14:08:44 +09:00
  • 0ed00d9d30 ci : update windows-blas uploads action (#3192) Daniel Bevenius 2025-05-27 18:01:31 +02:00
  • 527fe6aaeb sync : fix builds - musa, ruby Georgi Gerganov 2025-05-27 18:02:37 +03:00
  • 26eb48cb08 talk-llama : sync llama.cpp Georgi Gerganov 2025-05-27 17:08:24 +03:00
  • 546928c33f sync : ggml Georgi Gerganov 2025-05-27 17:07:06 +03:00
  • 15ae9dc2a4 ggml : riscv: add xtheadvector support (llama/13720) xctan 2025-05-27 21:21:36 +08:00
  • 2e7a1e3e43 ggml-cpu: x86 feature detection is specific to x86 (llama/13811) Christian Kastner 2025-05-27 13:18:39 +02:00
  • b75babebb2 ggml : allow CUDA graphs when using pipeline parallelism (llama/13814) Diego Devesa 2025-05-27 04:05:18 -07:00
  • cc7a0105ef cuda : avoid cuGetErrorString (llama/13791) Georgi Gerganov 2025-05-26 22:14:52 +03:00
  • 195fde8804 SYCL: Add non contiguous support in RMS_NORM and NORM kernels (llama/13611) Akarshan Biswas 2025-05-26 21:10:36 +05:30
  • 25e27904ca sycl: Add more debug prints (llama/13640) Romain Biessy 2025-05-26 10:28:53 +02:00
  • 474f7be8b6 vulkan: mark IM2COL as supporting non-contig (llama/13783) Jeff Bolz 2025-05-25 23:02:07 -05:00
  • e35fecc2a1 CANN: Add the basic supports of Flash Attention kernel (llama/13627) Bizhao Shi 2025-05-26 10:20:18 +08:00
  • 1cd7028428 SYCL: revert "sycl: simplify bin_bcast_kernel (ggml/13383)" (llama/13752) Akarshan Biswas 2025-05-25 12:38:37 +05:30
  • 99596d6031 ggml-cpu : set openmp wait time if not set (llama/13758) Diego Devesa 2025-05-24 13:26:47 -07:00
  • 2d6c6862f7 ggml : add ggml_gelu_erf() CUDA kernel (llama/13719) Xuan-Son Nguyen 2025-05-24 13:06:47 +02:00
  • f1576b2659 CUDA: fix race condition in FA vector kernels (llama/13742) Johannes Gäßler 2025-05-24 11:46:19 +02:00
  • 994b4f86ab CANN: Support MUL_MAT_ID for q8_0 and q4_0 (llama/13705) Chenguang Li 2025-05-23 16:47:53 +08:00
  • 3e7eaccf55 ggml : fix the order of ggml_unary_op (llama/13718) Xuan-Son Nguyen 2025-05-23 08:12:48 +02:00
  • 191f040414 vulkan: support CPY from any type to itself (llama/13695) Jeff Bolz 2025-05-23 00:45:02 -04:00
  • 2d49d4a9b5 vulkan: Disable coopmat/coopmat2/bfloat extensions if glslc doesn't support it (llama/13696) Jeff Bolz 2025-05-23 00:33:45 -04:00
  • 000d65befb use LOG_WARN to replace std::cerr (llama/13657) Judd 2025-05-23 12:33:08 +08:00
  • f0803e6646 sycl : Remove waits from function calls (llama/13702) Nicolò Scipione 2025-05-22 13:54:43 +02:00
  • 730a00be8a SYCL: Avoid using with SYCL-Graph for unsupported nodes (llama/13587) Ewan Crawford 2025-05-22 09:24:09 +01:00
  • 316600e8ee opencl: Add support for multiple devices (llama/12622) Henry Linjamäki 2025-05-22 02:21:45 +03:00
  • 42f2b3bb65 opencl: fix couple crashes (llama/12795) Henry Linjamäki 2025-05-21 23:21:17 +03:00
  • dd6ef64060 ggml : add ggml_gelu_erf() (llama/13667) Xuan-Son Nguyen 2025-05-21 16:26:33 +02:00
  • 131ee546ca musa: Upgrade MUSA SDK version to rc4.0.1 and use mudnn::Unary::IDENTITY op to accelerate D2D memory copy (llama/13647) R0CKSTAR 2025-05-21 09:58:49 +08:00
  • 4712f7b663 vulkan: fix warnings (llama/13626) Eve 2025-05-20 21:35:16 +00:00
  • 926fe234e9 CUDA: skip fully masked-out KV in FA vec kernel (llama/13584) Johannes Gäßler 2025-05-20 14:45:07 +02:00
  • f44b53480f sycl: disable reorder for sycl mulmat (llama/13536) Svetlozar Georgiev 2025-05-20 10:34:15 +01:00
  • e04e8f1c79 metal : fix typo in FA kernel comments (llama/13651) Georgi Gerganov 2025-05-20 10:41:40 +03:00
  • ee3f177cba sycl : Overcoming workaround for mmap() allocation on Windows (llama/13482) Nicolò Scipione 2025-05-20 02:54:43 +02:00
  • 0b69f74e15 Vulkan: Add f32 accumulator support to quantized mul mat to fix GLM4 32B incoherence (llama/13607) 0cc4m 2025-05-19 17:54:08 +02:00
  • e415db0ed7 sync : ggml Georgi Gerganov 2025-05-27 17:06:49 +03:00
  • 2bb7694edb docs : convert README_sycl.md to utf8 format [no ci] (#3191) Daniel Bevenius 2025-05-27 10:53:50 +02:00
  • 450de0787e node : enable no_prints to suppress all output (#3189) Daniel Bevenius 2025-05-27 05:51:47 +02:00
  • ea9f206f18 talk-llama : fix for swedish umlauts + expose model inference settings in talk-llama.cpp (#3187) matteng1 2025-05-26 07:57:39 +02:00
  • 13d92d08ae docs : fix VAD section heading levels (#3186) KITAITI Makoto 2025-05-23 17:38:26 +09:00
  • aab6976465 ci : use dynamic libopenblas.dll for window-blas (#3177) Daniel Bevenius 2025-05-23 05:48:08 +02:00
  • 78b31ca782 server : Add k6 Load Testing Script (#3175) Sacha Arbonel 2025-05-22 10:03:04 +02:00
  • cbe557f9b1 docs : add VAD model download instructions [no ci] (#3180) Daniel Bevenius 2025-05-22 07:49:29 +02:00
  • 273af4aab9 docs : replace typo "]" with ")" in README (#3179) Alpaim 2025-05-22 06:49:44 +03:00
  • bd1cb0c8e3 whisper : remove redundant assignments (#3178) Daniel Bevenius 2025-05-21 13:23:20 +02:00
  • 62dc8f7d7b whisper : update CMakeLists.txt to handle deprecated gpu Warnings (#3163) Jugal Haresh Sheth 2025-05-20 10:58:25 +01:00
  • 2c4b904596 ruby : add GGML_SYCL_DNN option to ruby bindings (#3172) Daniel Bevenius 2025-05-19 17:59:43 +02:00
  • 6b6cf19c65 talk-llama : sync llama.cpp Georgi Gerganov 2025-05-19 13:39:12 +03:00
  • 05501c218d sync : ggml Georgi Gerganov 2025-05-19 13:38:44 +03:00
  • 9da3fc27be CANN: Support MOE Model MUL_MAT_ID (llama/13042) Chenguang Li 2025-05-19 14:21:17 +08:00
  • 2c13651e08 cmake: use the current build config for vulkan-shaders-gen (llama/13595) Gilad S. 2025-05-17 21:26:43 +03:00
  • 13dca86c56 vulkan: move common FA code to flash_attn_base.comp (llama/13556) Jeff Bolz 2025-05-17 16:14:55 +09:00
  • 6d61a09bc4 vulkan: use scalar FA rather than coopmat2 when N==1 (llama/13554) Jeff Bolz 2025-05-17 15:35:47 +09:00
  • 4fedad988b metal : add FA-vec kernel for head size 64 (llama/13583) Georgi Gerganov 2025-05-16 20:32:58 +03:00
  • a8e17a244d sycl : fixed compilation warnings (llama/13582) Łukasz Ślusarczyk 2025-05-16 12:15:29 +02:00
  • 0c76acd08a gguf : use ggml log system (llama/13571) Diego Devesa 2025-05-15 10:13:11 -07:00
  • 27964db1be sycl: simplify bin_bcast_kernel (llama/13383) Atharva Dubey 2025-05-15 16:39:52 +01:00
  • 8081e7a23d sycl: reordered Q4_K MMVQ (llama/13109) Svetlozar Georgiev 2025-05-15 16:35:44 +01:00
  • d807c497a4 sycl: use oneDNN for matrices multiplication (llama/12972) Łukasz Ślusarczyk 2025-05-15 16:53:41 +02:00
  • 8e9bf548f4 arm64: optimize q6_k_q8_k kernel with i8mm (llama/13519) Yibo Cai 2025-05-15 03:53:52 +08:00
  • 0dda27bc0b CUDA: fix crash on large batch size for quant. MoE (llama/13537) Johannes Gäßler 2025-05-14 16:41:02 +02:00
  • ffa4720f25 CUDA: faster Deepseek FA, add Turing support (llama/13435) Johannes Gäßler 2025-05-14 16:08:20 +02:00
  • 9b8eea28b5 cmake: simplify vulkan shader test logic (llama/13263) bandoti 2025-05-14 07:53:57 -03:00
  • 162bbe8220 vulkan: KHR_coopmat flash attention (llama/13506) Jeff Bolz 2025-05-14 18:55:26 +09:00
  • a221288dc6 vulkan: workaround FA compile failures on macos (llama/13517) Jeff Bolz 2025-05-14 13:15:50 +09:00
  • 08436716ae metal : use FA-vec kernel up to batch size 20 (llama/13496) Georgi Gerganov 2025-05-13 18:04:39 +03:00
  • e11fc21e6c metal : optimize multi-sequence FA vec kernel (llama/13493) Georgi Gerganov 2025-05-13 18:04:00 +03:00
  • a77a924b20 ggml-cpu: Update KleidiAI to v1.6 and fix include directives (llama/13509) Dan Johansson 2025-05-13 17:02:28 +02:00
  • 405b9c77ad mnist: fix segmentation fault (ggml/1227) Johannes Gäßler 2025-05-19 09:33:35 +02:00
  • 9c3bfc1499 ggml : fix apple OS check in ggml_print_backtrace (ggml/1229) Diego Devesa 2025-05-18 18:30:13 -07:00
  • 5b7797f674 ggml : Fix missing backtrace on Linux (ggml/1228) Daniel Tang 2025-05-17 19:06:26 -04:00
  • 82ad275800 examples : add vad-speech-segments to win warns [no ci] (#3170) Daniel Bevenius 2025-05-19 12:17:18 +02:00
  • d1f114da61 vad : return early if no vad segments are detected (#3158) Daniel Bevenius 2025-05-16 08:50:53 +02:00
  • bae5d074c7 vad : store VAD context in whisper_state (#3156) Daniel Bevenius 2025-05-16 07:53:26 +02:00
  • 20a20decd9 whisper : add build_*/ to .gitignore [no ci] (#3157) Daniel Bevenius 2025-05-15 14:28:10 +02:00
  • f389d7e3e5 examples : add --print-confidence option to cli (#3150) Daniel Bevenius 2025-05-14 19:21:48 +02:00
  • 96d791ae61 vad : add download-vad-model scripts (#3149) Daniel Bevenius 2025-05-14 16:47:18 +02:00
  • 3882a099e1 server : add --flash-attn usage output (#3152) Daniel Bevenius 2025-05-14 15:22:05 +02:00
  • f890560575 talk-llama : sync llama.cpp Georgi Gerganov 2025-05-13 13:20:19 +03:00
  • a14c89aefa whisper : update to ggml-backend changes (#0) Georgi Gerganov 2025-05-13 13:11:24 +03:00
  • a6a956b36d sync : ggml Georgi Gerganov 2025-05-13 13:10:17 +03:00
  • 75e9a840c5 ggml : add mrope kernel for metal (llama/13457) Xuan-Son Nguyen 2025-05-13 13:10:08 +03:00
  • 41ed62bdbc metal : optimize MoE for large batches (llama/13388) Georgi Gerganov 2025-05-13 13:09:20 +03:00
  • 029c8837f8 opencl: remove unnecessary assert for add (llama/13257) lhez 2025-05-12 13:13:49 -07:00
  • 5d8b068249 llama/ggml: add LLM training support (llama/10544) Johannes Gäßler 2025-05-12 14:44:49 +02:00
  • 93ef22657e ggml-cpu: Integrate fp32=bf16xbf16 SME KleidiAI kernel (llama/13053) Dan Johansson 2025-05-12 13:06:19 +02:00
  • 866f685bbc CUDA: fix misaligned synchronization in FA (llama/13469) Johannes Gäßler 2025-05-12 10:51:21 +02:00
  • 250bcc041a enable dpcpp nightly builds with libraries (llama/13406) Atharva Dubey 2025-05-12 06:15:32 +01:00
  • 90b17a99bf CUDA: fix crash with partial offloading of MoE (llama/13439) Johannes Gäßler 2025-05-11 16:09:33 +02:00
  • e1b2ace0f8 Add --no-op-offload to improve -ot pp perf in MoE models like llama4 400B (llama/13386) David Huang 2025-05-11 20:18:39 +08:00
  • 6db0e01db6 CUDA: fix race conditions FlashAttention kernels (llama/13438) Johannes Gäßler 2025-05-10 22:22:48 +02:00
  • 16f3546f38 CUDA: fix FlashAttention on Turing (llama/13415) Johannes Gäßler 2025-05-10 09:16:52 +02:00
  • a04b329ad1 vulkan: scalar flash attention implementation (llama/13324) Jeff Bolz 2025-05-09 23:07:07 -07:00
  • 45d8b2352e sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs (llama/12858) Alberto Cabrera Pérez 2025-05-09 16:34:08 +01:00
  • 2d436bfbfb CUDA: FA support for Deepseek (Ampere or newer) (llama/13306) Johannes Gäßler 2025-05-09 13:34:58 +02:00
  • 4b7cbb62ef CUDA: fix crash on large batch size for MoE models (llama/13384) Johannes Gäßler 2025-05-09 12:14:04 +02:00
  • e27c91f6d6 rpc : add rpc_msg_set_tensor_hash_req (llama/13353) Radoslav Gerganov 2025-05-09 10:31:07 +03:00
  • e46df4850f vulkan: Allow up to 4096 elements for mul_mat_id row_ids (llama/13326) Jeff Bolz 2025-05-09 02:23:41 -05:00
  • e8a7f1b7bb sycl: addressing non-contiguous src1 mul_mats (nc and batched) (llama/13343) Alberto Cabrera Pérez 2025-05-08 10:08:01 +01:00