Commit Graph

  • 7ac2f17fac cuda : only use native when supported by cmake (llama/10389) Diego Devesa 2024-11-18 18:43:40 +0100
  • 48862c7b27 vulkan: remove use of null initializer (llama/10372) Jeff Bolz 2024-11-18 08:28:42 -0600
  • 44f7d9f4e3 metal : fix offset integer overflows in im2col (ggml/1015) Plamen Minev 2024-11-18 15:02:27 +0200
  • fd12302587 Vulkan: Fix device info output format specifiers (llama/10366) 0cc4m 2024-11-18 11:02:43 +0100
  • f80bef4630 metal : add GGML_UNARY_OP_ELU kernel (ggml/1018) PAB 2024-11-18 10:02:49 +0100
  • 161b443514 CUDA: fix MMV kernel being used for FP16 src1 (llama/10357) Johannes Gäßler 2024-11-17 23:20:42 +0100
  • ef7fbe1c66 CMake: fix typo in comment [no ci] (llama/10360) Johannes Gäßler 2024-11-17 12:59:38 +0100
  • 0879d3599e llama : only use default buffer types for the KV cache (llama/10358) Diego Devesa 2024-11-17 12:25:45 +0100
  • 2a444dc5bd metal : refactor kernel args into structs (llama/10238) Georgi Gerganov 2024-11-17 11:23:01 +0200
  • 45cf1634dc ggml : fix undefined reference to 'getcpu' (llama/10354) FirstTimeEZ 2024-11-17 21:39:22 +1300
  • dcb2922d1d CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318) Johannes Gäßler 2024-11-17 09:09:55 +0100
  • 3c5c751174 CMake: default to -arch=native for CUDA build (llama/10320) Johannes Gäßler 2024-11-17 09:06:34 +0100
  • 24ad19d0e9 ggml : fix possible buffer use after free in sched reserve (llama/9930) Diego Devesa 2024-11-17 07:31:17 +0100
  • bd574b05af ggml : inttypes.h -> cinttypes (llama/0) Georgi Gerganov 2024-11-16 23:40:39 +0200
  • 7e0eafcb1e ggml : adapt AMX to tensor->grad removal (llama/0) Georgi Gerganov 2024-11-16 21:38:01 +0200
  • 75670ae673 ggml : fix compile warnings (llama/0) Georgi Gerganov 2024-11-16 21:32:41 +0200
  • d4fcdf602b llamafile : fix include path (llama/0) Georgi Gerganov 2024-11-16 17:58:56 +0200
  • 1bebb1a116 vulkan: Optimize some mat-vec mul quant shaders (llama/10296) Jeff Bolz 2024-11-16 00:26:57 -0600
  • ee437cde59 ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324) Dan Johansson 2024-11-16 01:53:37 +0100
  • c1506d38cf Make updates to fix issues with clang-cl builds while using AVX512 flags (llama/10314) Srihari-mcw 2024-11-16 02:57:00 +0530
  • c9541741e6 ggml: new optimization interface (ggml/988) Johannes Gäßler 2024-11-16 13:49:35 +0100
  • 6a55015dc4 ggml : remove duplicated sources from the last sync (ggml/1017) Georgi Gerganov 2024-11-15 23:52:31 +0200
  • 7e86030d4d ggml : fix some build issues slaren 2024-11-15 20:20:54 +0100
  • 401fbea326 sync : leftovers (ggml/0) Georgi Gerganov 2024-11-15 21:43:41 +0200
  • 44d1cbdfe9 cmake : restore CMakeLists.txt (llama/10256) Georgi Gerganov 2024-11-15 21:35:51 +0200
  • 3216efef2e AVX BF16 and single scale quant optimizations (llama/10212) Eve 2024-11-15 11:47:58 +0000
  • 2c0484ebf7 sycl: Use syclcompat::dp4a (llama/10267) Romain Biessy 2024-11-15 04:09:12 +0100
  • 3298916e5e backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921) Charles Xu 2024-11-15 01:28:50 +0100
  • 746bf2596f ggml : build backends as libraries (llama/10256) Diego Devesa 2024-11-14 18:04:35 +0100
  • 5f7e094ccb scripts : update sync Georgi Gerganov 2024-11-19 18:59:18 +0200
  • e6114173b8 whisper : use backend registry (#0) Georgi Gerganov 2024-11-20 15:32:34 +0200
  • 85ff4f974e Fix crash in ggml_vk_print_gpu_info Juliusz Chroboczek 2024-11-20 16:35:35 +0100
  • c800966378 ggml/sched : do not skip views in pre-assignments slaren 2024-11-20 13:25:08 +0100
  • 8c24c64924 whisper : adapt to new ggml (wip) Georgi Gerganov 2024-11-19 19:09:07 +0200
  • 4e1f516ecc talk-llama : sync llama.cpp Georgi Gerganov 2024-11-19 19:08:57 +0200
  • 0eddc9fcbc sync : ggml Georgi Gerganov 2024-11-19 19:04:21 +0200
  • 52799f9082 ggml : sync resolve (skip) (#0) Georgi Gerganov 2024-11-19 19:03:47 +0200
  • bfaf1fc76f Add required ggml-base and backend libs to cmake pkg (llama/10407) bandoti 2024-11-19 12:10:30 -0400
  • 166237d07e cuda : fix CUDA_FLAGS not being applied (llama/10403) Diego Devesa 2024-11-19 14:29:38 +0100
  • d2aaf9ecfc sycl : Add option to set the SYCL architecture for all targets (llama/10266) Romain Biessy 2024-11-19 09:02:23 +0100
  • 29894ef822 vulkan: Optimize soft_max (llama/10301) Jeff Bolz 2024-11-19 01:25:17 -0600
  • 8d6e30fb61 sycl: Revert MUL_MAT_OP support changes (llama/10385) Alberto Cabrera Pérez 2024-11-19 00:50:04 +0000
  • 761d310e78 cuda : only use native when supported by cmake (llama/10389) Diego Devesa 2024-11-18 18:43:40 +0100
  • c4f4639466 vulkan: remove use of null initializer (llama/10372) Jeff Bolz 2024-11-18 08:28:42 -0600
  • c157f624e2 metal : fix offset integer overflows in im2col (ggml/1015) Plamen Minev 2024-11-18 15:02:27 +0200
  • 748d633638 Vulkan: Fix device info output format specifiers (llama/10366) 0cc4m 2024-11-18 11:02:43 +0100
  • 937684c822 metal : add GGML_UNARY_OP_ELU kernel (ggml/1018) PAB 2024-11-18 10:02:49 +0100
  • 58b5fc45b9 CUDA: fix MMV kernel being used for FP16 src1 (llama/10357) Johannes Gäßler 2024-11-17 23:20:42 +0100
  • fcd8ea6aff CMake: fix typo in comment [no ci] (llama/10360) Johannes Gäßler 2024-11-17 12:59:38 +0100
  • 6b4de57e65 llama : only use default buffer types for the KV cache (llama/10358) Diego Devesa 2024-11-17 12:25:45 +0100
  • dca00d8374 metal : refactor kernel args into structs (llama/10238) Georgi Gerganov 2024-11-17 11:23:01 +0200
  • a901ba0716 ggml : fix undefined reference to 'getcpu' (llama/10354) FirstTimeEZ 2024-11-17 21:39:22 +1300
  • 8bd8688888 CUDA: remove DMMV, consolidate F16 mult mat vec (llama/10318) Johannes Gäßler 2024-11-17 09:09:55 +0100
  • 77ea626d26 CMake: default to -arch=native for CUDA build (llama/10320) Johannes Gäßler 2024-11-17 09:06:34 +0100
  • c96434f2b3 ggml : fix possible buffer use after free in sched reserve (llama/9930) Diego Devesa 2024-11-17 07:31:17 +0100
  • 3f1a78d6f8 ggml : inttypes.h -> cinttypes (llama/0) Georgi Gerganov 2024-11-16 23:40:39 +0200
  • 600728ea21 ggml : adapt AMX to tensor->grad removal (llama/0) Georgi Gerganov 2024-11-16 21:38:01 +0200
  • e726307095 ggml : fix compile warnings (llama/0) Georgi Gerganov 2024-11-16 21:32:41 +0200
  • 7caa6b2e83 llamafile : fix include path (llama/0) Georgi Gerganov 2024-11-16 17:58:56 +0200
  • 68b198b438 vulkan: Optimize some mat-vec mul quant shaders (llama/10296) Jeff Bolz 2024-11-16 00:26:57 -0600
  • 49ca4814be ggml : optimize Q4_0 into Q4_0_X_Y repack (llama/10324) Dan Johansson 2024-11-16 01:53:37 +0100
  • 4b8ddfbda7 Make updates to fix issues with clang-cl builds while using AVX512 flags (llama/10314) Srihari-mcw 2024-11-16 02:57:00 +0530
  • adf81dc329 ggml: new optimization interface (ggml/988) Johannes Gäßler 2024-11-16 13:49:35 +0100
  • f33c7ea0c5 ggml : remove duplicated sources from the last sync (ggml/1017) Georgi Gerganov 2024-11-15 23:52:31 +0200
  • 83c77397e4 ggml : fix some build issues slaren 2024-11-15 20:20:54 +0100
  • 8dffd6444c sync : leftovers (ggml/0) Georgi Gerganov 2024-11-15 21:43:41 +0200
  • 1d49a2e7a2 cmake : restore CMakeLists.txt (llama/10256) Georgi Gerganov 2024-11-15 21:35:51 +0200
  • 0df66d6586 AVX BF16 and single scale quant optimizations (llama/10212) Eve 2024-11-15 11:47:58 +0000
  • 04d1bae6d4 sycl: Use syclcompat::dp4a (llama/10267) Romain Biessy 2024-11-15 04:09:12 +0100
  • 41c90650a2 backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (llama/9921) Charles Xu 2024-11-15 01:28:50 +0100
  • ce58be7e79 ggml : build backends as libraries (llama/10256) Diego Devesa 2024-11-14 18:04:35 +0100
  • 06c86c03d8 scripts : update sync Georgi Gerganov 2024-11-19 18:59:18 +0200
  • 6266a9f9e5 release : v1.7.2 v1.7.2 Georgi Gerganov 2024-11-19 18:54:22 +0200
  • d24f981fb2 sycl: fix example build (#2570) Stefan Sydow 2024-11-18 13:57:23 +0100
  • 4187c6ca19 sycl: fix example build Stefan Sydow 2024-11-08 22:13:29 +0100
  • c5b9b546b8 docs: Update README.md for whisper.objc app Tomer Schlesinger 2024-11-17 20:41:48 +0200
  • 01d3bd7d5c ci : use local ggml in Android build (#2567) Georgi Gerganov 2024-11-16 20:45:41 +0200
  • 511579cc15 ci : use local ggml gg/ci-fix-android Georgi Gerganov 2024-11-16 20:31:57 +0200
  • bb12cd9b77 ggml : tmp workaround for whisper.cpp (skip) (#2565) Georgi Gerganov 2024-11-16 20:19:02 +0200
  • f02b40bcb4 update : readme v1.7.2-pre Georgi Gerganov 2024-11-15 16:00:10 +0200
  • 83ac2842bd scripts : fix sync path Georgi Gerganov 2024-11-15 15:24:09 +0200
  • c4e95fb74d whisper.swiftui : switch Mac dest to Mac (Designed for iPad) (#2562) Jhen-Jie Hong 2024-11-15 21:21:53 +0800
  • e23721f3fb cmake : fix ppc64 check (#0) Georgi Gerganov 2024-11-15 09:04:34 +0200
  • c0a9f8ef85 whisper : include ggml-cpu.h (#0) Georgi Gerganov 2024-11-15 11:01:47 +0200
  • 6477b84eb6 build : fixes Georgi Gerganov 2024-11-15 09:07:53 +0200
  • 24d706774d talk-llama : sync llama.cpp Georgi Gerganov 2024-11-15 08:41:06 +0200
  • 5089ab2d6a whisper : fix build (#0) Georgi Gerganov 2024-11-15 08:40:47 +0200
  • bdbb906817 sync : ggml Georgi Gerganov 2024-11-15 08:40:34 +0200
  • fa2ebd336e sycl : Fixes to broken builds and test-backend-ops (llama/10257) Alberto Cabrera Pérez 2024-11-13 09:40:57 +0000
  • 21b01a21b6 vulkan: Optimize contiguous copies (llama/10254) Jeff Bolz 2024-11-13 00:58:57 -0600
  • b54ce5edc5 vulkan: Throttle the number of shader compiles during the build step. (llama/10222) Jeff Bolz 2024-11-11 11:13:51 -0600
  • 26a31b78e9 metal : more precise Q*K in FA vec kernel (llama/10247) Georgi Gerganov 2024-11-11 08:39:13 +0200
  • 14d13c5f9f vulkan: Fix newly added tests for permuted mul_mat and 1D im2col (llama/10226) Jeff Bolz 2024-11-10 05:37:56 -0600
  • 5e110c2eb5 metal : reorder write loop in mul mat kernel + style (llama/10231) Georgi Gerganov 2024-11-09 11:53:13 +0200
  • 4a9926d521 metal : fix build and some more comments (llama/10229) Georgi Gerganov 2024-11-09 11:53:02 +0200
  • ae3c5642d0 metal : fix F32 accumulation in FA vec kernel (llama/10232) Georgi Gerganov 2024-11-09 11:52:45 +0200
  • e287a3b627 metal : hide debug messages from normal log Georgi Gerganov 2024-11-09 11:21:49 +0200
  • b890243690 ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL operator when ‘ne’ is small (#10213) SXX 2024-11-09 15:35:46 +0800
  • b7b38f7d68 ggml : optimize llamafile cpu matrix multiplication for ppc64le (llama/10156) amritahs-ibm 2024-11-09 12:47:50 +0530
  • 9f67aab211 metal : opt-in compile flag for BF16 (llama/10218) Georgi Gerganov 2024-11-08 21:59:46 +0200