Commit Graph

  • fe9b27d0af
    files : remove make artifacts Georgi Gerganov 2024-12-03 20:29:32 +0200
  • 76199eeac3
    common : fix compile warning Georgi Gerganov 2024-12-03 20:25:37 +0200
  • b383af9150
    ggml : move AMX to the CPU backend (llama/10570) Diego Devesa 2024-12-03 20:22:12 +0200
  • b40568327f
    metal : small-batch mat-mul kernels (llama/10581) Georgi Gerganov 2024-12-03 11:52:33 +0200
  • 59853e7f62
    SYCL: Fix and switch to GGML_LOG system instead of fprintf (llama/10579) Akarshan Biswas 2024-12-02 12:34:11 +0530
  • 06bf264158
    ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (llama/10567) Adrien Gallouët 2024-11-30 18:13:18 +0100
  • 621659c9bc
    vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536) Eve 2024-11-30 07:00:02 +0000
  • 8383be9ea3
    ggml : fix I8MM Q4_1 scaling factor conversion (llama/10562) Georgi Gerganov 2024-11-29 16:25:39 +0200
  • 544a4d4c04
    ggml-cpu: fix typo in gemv/gemm iq4_nl_4_4 (llama/10580) Shupei Fan 2024-11-29 21:49:02 +0800
  • 5acee88336
    sycl : offload of get_rows set to 0 (llama/10432) Alberto Cabrera Pérez 2024-11-29 12:38:45 +0000
  • 957e21eb05
    sycl : Reroute permuted mul_mats through oneMKL (llama/10408) Alberto Cabrera Pérez 2024-11-29 09:49:43 +0000
  • 29349f1e62
    CANN: RoPE operator optimization (llama/10563) Chenguang Li 2024-11-29 14:46:55 +0800
  • febda2f686
    vulkan: get the first command buffer submitted sooner (llama/10499) Jeff Bolz 2024-11-29 00:18:02 -0600
  • b8a6761b01
    ggml : remove redundant copyright notice + update authors Georgi Gerganov 2024-11-28 20:46:40 +0200
  • decea57e76
    ggml : fix row condition for i8mm kernels (llama/10561) Georgi Gerganov 2024-11-28 14:56:37 +0200
  • 0712712320
    cmake : fix ARM feature detection (llama/10543) Georgi Gerganov 2024-11-28 14:56:23 +0200
  • 23468e6400
    ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541) Shupei Fan 2024-11-28 20:52:03 +0800
  • 68e48d3466
    kompute : improve backend to pass test_backend_ops (llama/10542) Sergio López 2024-11-28 12:51:38 +0100
  • 3536fd2227
    CANN: Fix SOC_TYPE compile bug (llama/10519) leo-pony 2024-11-28 15:25:24 +0800
  • bb85fccfb0
    CANN: ROPE operator optimization (llama/10540) Chenguang Li 2024-11-28 14:24:46 +0800
  • d147926ea5
    Add some minimal optimizations for CDNA (llama/10498) uvos 2024-11-27 17:10:08 +0100
  • 98690a8ff3
    metal : fix group_norm support condition (llama/0) Georgi Gerganov 2024-11-27 11:22:14 +0200
  • 5aad67da87
    vulkan: define all quant data structures in types.comp (llama/10440) Jeff Bolz 2024-11-27 01:32:54 -0600
  • 6ecca7d1bf
    vulkan: Handle GPUs with less shared memory (llama/10468) Jeff Bolz 2024-11-27 01:30:27 -0600
  • 475517a329
    vulkan: further optimize q5_k mul_mat_vec (llama/10479) Jeff Bolz 2024-11-27 01:21:59 -0600
  • f69379f571
    vulkan: skip integer div/mod in get_offsets for batch_idx==0 (llama/10506) Jeff Bolz 2024-11-27 01:08:54 -0600
  • 6ebd263418
    vulkan: optimize Q2_K and Q3_K mul_mat_vec (llama/10459) Jeff Bolz 2024-11-27 01:00:50 -0600
  • 5ec2241255
    mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (llama/10516) R0CKSTAR 2024-11-27 00:00:41 +0800
  • 1e5fe6a8fd
    vulkan: fix group_norm (llama/10496) Jeff Bolz 2024-11-26 09:45:05 -0600
  • a116add0ba
    cmake : enable warnings in llama (llama/10474) Georgi Gerganov 2024-11-26 14:18:08 +0200
  • b054b831fb
    ggml-cpu: cmake add arm64 cpu feature check for macos (llama/10487) Charles Xu 2024-11-26 12:37:05 +0100
  • a7103bbd94
    CANN: Improve the Inferencing Performance for Ascend NPU Device (llama/10454) Shanshan Shen 2024-11-26 18:08:37 +0800
  • c211ba423c
    CANN: RoPE and CONCAT operator optimization (llama/10488) Chenguang Li 2024-11-26 17:31:05 +0800
  • 8eb23b94fe
    vulkan: Fix a vulkan-shaders-gen argument parsing error (llama/10484) Junil Kim 2024-11-26 10:47:20 +0900
  • 9229dd6286
    metal : enable mat-vec kernels for bs <= 4 (llama/10491) Georgi Gerganov 2024-11-25 21:49:31 +0200
  • 640732f2a1
    llama : accept a list of devices to use to offload a model (llama/10497) Diego Devesa 2024-11-25 19:30:06 +0100
  • 920a48a7e5
    ggml : add support for dynamic loading of backends (llama/10469) Diego Devesa 2024-11-25 15:13:39 +0100
  • 5be17d670a
    metal : minor code formatting Georgi Gerganov 2024-11-25 15:08:04 +0200
  • cd3456d61c
    ggml : do not use ARM features not included in the build (llama/10457) Diego Devesa 2024-11-23 14:41:12 +0100
  • 761a3e8ae7
    CANN: Support Ascend310P to accelerate F32 and F16 Model (llama/10216) leo-pony 2024-11-22 14:07:20 +0800
  • cbfbf5fa98
    cuda : optimize argmax (llama/10441) Diego Devesa 2024-11-21 18:18:50 +0100
  • 176d6896e6
    vulkan: predicate max operation in soft_max shaders (llama/10437) Jeff Bolz 2024-11-20 13:47:36 -0600
  • b4fa978bab
    vulkan: copy iq4_nl LUT into shared memory (llama/10409) Jeff Bolz 2024-11-20 01:40:18 -0600
  • 133fb12e0b
    vulkan: further optimize mul_mat_vec using larger loads (llama/10387) Jeff Bolz 2024-11-20 01:11:00 -0600
  • b1c6a66f66
    add cmake rvv support (llama/10411) haopeng 2024-11-20 04:10:31 +0800
  • 2404fcaf8d
    CUDA: remove unnecessary warp reduce in FA (ggml/1032) mahorozte 2024-12-03 21:11:43 +0800
  • 2b86e5931f
    feat: add GGML_UNARY_OP_ARGMAX Metal kernel (ggml/1019) PAB 2024-12-02 19:27:24 +0100
  • 2a2ed50a6c
    metal : add GGML_OP_CONV_TRANSPOSE_1D kernels (ggml/1026) PAB 2024-11-28 09:25:06 +0100
  • 3445025904
    Do not include arm_neon.h when compiling CUDA code (ggml/1028) Frankie Robertson 2024-11-26 15:50:26 +0200
  • 62fd128c40
    ggml-opt: fix data corruption (ggml/1022) Johannes Gäßler 2024-11-20 14:56:04 +0100
  • f07ec1bc17
    Update soft_max.comp gn64 2024-12-03 06:17:05 +0900
  • a604352d01
    Merge 99cc7f9566 into 021eef1000 Mohammadreza Hendiani 2024-12-02 03:18:59 +0000
  • 759ef4b405
    Merge fbb8bc6c2f into 021eef1000 Shi Liang 2024-11-29 23:07:52 +0100
  • 13cf1b6a7e
    Merge b67bdc9430 into 021eef1000 Georgi Gerganov 2024-11-28 10:14:27 +0000
  • 021eef1000
    ruby : Add low-level methods to transcribe (#2585) KITAITI Makoto 2024-11-28 17:33:07 +0900
  • a9d06ce151
    models : add q8_0 models to download-ggml-model.sh (#2589) Michael Rienstra 2024-11-28 00:31:54 -0800
  • 2e1fb518d1
    Add VULKAN option for cmake and doc in readme DrEmixam 2024-11-27 05:29:19 +0100
  • 7d5936f8b8
    Remove unused code Kitaiti Makoto 2024-11-26 14:51:55 +0900
  • 9c77e8f3ff
    Add q8_0 models to download-ggml-model.sh Michael Rienstra 2024-11-25 12:19:27 -0800
  • 4ce87c47f4
    Update README Kitaiti Makoto 2024-11-24 21:01:07 +0900
  • 77c559d0dc
    Update README Kitaiti Makoto 2024-11-24 19:55:12 +0900
  • e85dac0121
    Add comment on samples data type Kitaiti Makoto 2024-11-24 19:51:39 +0900
  • af420ca297
    Use Ruby 3.1 on CI Kitaiti Makoto 2024-11-24 19:48:10 +0900
  • 46199eceed
    Make Whisper::Context#full and #full_parallel accept MemoryView Kitaiti Makoto 2024-11-23 22:23:56 +0900
  • 70ce2a021c
    Add test for MemoryView Kitaiti Makoto 2024-11-23 22:21:13 +0900
  • aa3c9abb6e
    Build test class before running test Kitaiti Makoto 2024-11-23 22:18:30 +0900
  • 3fde7f3163
    Add class to test MemoryView Kitaiti Makoto 2024-11-23 22:18:10 +0900
  • 71b1c3f86e
    Hide Whisper's instance methods from Ruby code Kitaiti Makoto 2024-11-16 21:06:20 +0900
  • 380838dce8
    Add Whisper::Context#full_parallel Kitaiti Makoto 2024-11-16 04:19:26 +0900
  • f5753ce863
    Add test for Whisper::Context#full_parallel Kitaiti Makoto 2024-11-16 04:19:20 +0900
  • a1066c467d
    Add description to Whisper::Context#full Kitaiti Makoto 2024-11-16 04:18:58 +0900
  • 79ec5498b7
    Add additional signature for Whisper::Context#full Kitaiti Makoto 2024-11-14 18:56:23 +0900
  • ff44e911bb
    Add document of Whisper::Context#full [skip ci] Kitaiti Makoto 2024-11-14 18:52:47 +0900
  • f71e330d6c
    Add tests for Whisper::Error Kitaiti Makoto 2024-11-14 18:37:10 +0900
  • fbc4d0d07f
    Add Whisper::Context#full Kitaiti Makoto 2024-11-14 18:37:01 +0900
  • 4ddb3f3528
    Add tests for Whisper::Context#full Kitaiti Makoto 2024-11-14 08:08:43 +0900
  • 8c6a9b8bb6
    ruby : Follow source tree change (#2580) KITAITI Makoto 2024-11-22 00:04:29 +0900
  • 628e404425
    Use ternary operator Kitaiti Makoto 2024-11-21 22:50:51 +0900
  • 7b90773d47
    Use GitHub workflow setting for dependency definition Kitaiti Makoto 2024-11-21 22:45:01 +0900
  • 67a67a5b7b
    Fix paths in GitHub workflow for Ruby bindings Kitaiti Makoto 2024-11-21 22:28:18 +0900
  • b0aeef2d52
    ci : fix windows builds to use 2019 gg/ci-fix-windows Georgi Gerganov 2024-11-21 14:28:14 +0200
  • 647696eef5
    Follow whisper.cpp log level change Kitaiti Makoto 2024-11-21 21:26:04 +0900
  • f31aa20843
    Update whispercpp.gemspec Kitaiti Makoto 2024-11-21 21:21:55 +0900
  • 5b9997c424
    Follow whisper.cpp source tree change Kitaiti Makoto 2024-11-21 21:14:28 +0900
  • 60c293e943
    openvino : Pass CPU threads parameter Karthick J 2024-11-21 12:43:14 +0530
  • b67bdc9430
    disable gg/objc Georgi Gerganov 2024-11-20 23:18:58 +0200
  • 5e966f7844
    try3 Georgi Gerganov 2024-11-20 22:02:49 +0200
  • 54005478af
    try2 Georgi Gerganov 2024-11-20 21:42:58 +0200
  • 49c389b40a
    examples : try to fix objc CI Georgi Gerganov 2024-11-20 21:28:43 +0200
  • 37c88027e1
    whisper : use backend registry (#0) Georgi Gerganov 2024-11-20 15:32:34 +0200
  • 9db070a3c5
    ggml/sched : do not skip views in pre-assignments slaren 2024-11-20 13:25:08 +0100
  • 7fd8d9c220
    whisper : adapt to new ggml (wip) Georgi Gerganov 2024-11-19 19:09:07 +0200
  • 06e059b8f8
    talk-llama : sync llama.cpp Georgi Gerganov 2024-11-19 19:08:57 +0200
  • c9f49d5f9d
    sync : ggml Georgi Gerganov 2024-11-19 19:04:21 +0200
  • f4c1d7df39
    ggml : sync resolve (skip) (#0) Georgi Gerganov 2024-11-19 19:03:47 +0200
  • 339b8e559c
    Add required ggml-base and backend libs to cmake pkg (llama/10407) bandoti 2024-11-19 12:10:30 -0400
  • 5f6d6919b4
    cuda : fix CUDA_FLAGS not being applied (llama/10403) Diego Devesa 2024-11-19 14:29:38 +0100
  • 8ee767732f
    sycl : Add option to set the SYCL architecture for all targets (llama/10266) Romain Biessy 2024-11-19 09:02:23 +0100
  • 45f1f9144f
    vulkan: Optimize soft_max (llama/10301) Jeff Bolz 2024-11-19 01:25:17 -0600
  • 53589c8f12
    sycl: Revert MUL_MAT_OP support changes (llama/10385) Alberto Cabrera Pérez 2024-11-19 00:50:04 +0000