Commit Graph

  • 49c33aa40d Fix typo in download-ggml-model.sh Michael Rienstra 2024-12-11 13:49:59 -0800
  • 262e865a70 ruby : Sync whisper.cpp and model download feature (#2617) KITAITI Makoto 2024-12-09 20:17:50 +0900
  • d4e47945e3 Use conditional get when get model files Kitaiti Makoto 2024-12-03 23:15:15 +0900
  • a0f3d8a831 Cosmetic fix Kitaiti Makoto 2024-12-01 08:26:47 +0900
  • 4559a70035 Don't care about no longer included file Kitaiti Makoto 2024-12-01 08:24:49 +0900
  • b8a5c85780 Remove unused function Kitaiti Makoto 2024-12-01 08:04:27 +0900
  • 0ed5b2399c Add headings to API section in README [skip ci] Kitaiti Makoto 2024-12-01 02:24:03 +0900
  • 9e50697dc1 Update documents Kitaiti Makoto 2024-12-01 02:10:01 +0900
  • d8d89d73e4 Add shorthand for pre-converted models Kitaiti Makoto 2024-12-01 02:09:44 +0900
  • 3fd13ae71f Make Whisper::Context#initialize accept Pathname Kitaiti Makoto 2024-11-29 22:09:55 +0900
  • d862e8359c Add test for Pathname of model Kitaiti Makoto 2024-11-29 22:05:52 +0900
  • b53b44e0ff Use C++17 Kitaiti Makoto 2024-12-06 23:05:42 +0900
  • ed733e85a1 scripts : update to new build system v1.7.3-pre Georgi Gerganov 2024-12-09 11:30:16 +0200
  • 5980b1ae77 devops : add cmake Georgi Gerganov 2024-12-08 23:09:26 +0200
  • 0415a66044 devops : update make commands Georgi Gerganov 2024-12-08 23:07:29 +0200
  • 7d134e3737 ggml : remove old files (skip) (#0) Georgi Gerganov 2024-12-08 23:04:26 +0200
  • 9df53b357e ggml : sync remnants (skip) (#0) Georgi Gerganov 2024-12-08 22:48:25 +0200
  • b2115b4d9b scripts : remove amx from sync Georgi Gerganov 2024-12-08 22:48:14 +0200
  • 0164427dd5 ci : disable freeBSD builds [no ci] Georgi Gerganov 2024-12-08 15:52:57 +0200
  • 627b11c78a readme : update build instructions Georgi Gerganov 2024-12-08 15:48:14 +0200
  • 472464453d ci : disable CUDA and Android builds Georgi Gerganov 2024-12-08 15:36:01 +0200
  • 11dddfbc9e ci : disable Obj-C build + fixes Georgi Gerganov 2024-12-08 13:35:35 +0200
  • 384e214cc7 make : shim cmake Georgi Gerganov 2024-12-06 15:34:53 +0200
  • f2c680f893 talk-llama : sync llama.cpp Georgi Gerganov 2024-12-05 14:30:33 +0200
  • fbe66da0e5 sync : ggml Georgi Gerganov 2024-12-05 14:29:18 +0200
  • a815940e0e ggml : add predefined list of CPU backend variants to build (llama/10626) Diego Devesa 2024-12-04 14:45:40 +0100
  • 904e307bce ggml-cpu : fix HWCAP2_I8MM value (llama/10646) Diego Devesa 2024-12-04 14:40:44 +0100
  • 491ec076b4 vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642) Jeff Bolz 2024-12-04 01:28:59 -0600
  • 966433fdf2 SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (llama/10584) Nicolò Scipione 2024-12-04 02:29:20 +0100
  • 6f1ba9d82d Avoid using __fp16 on ARM with old nvcc (llama/10616) Frankie Robertson 2024-12-04 02:41:37 +0200
  • 015ecd0001 vulkan: optimize and reenable split_k (llama/10637) Jeff Bolz 2024-12-03 13:29:54 -0600
  • b7c64a4352 ggml: add GGML_SET Metal kernel + i32 CPU kernel (ggml/1037) PAB 2024-12-04 09:19:30 +0100
  • 7895d39508 ggml : add GGML_PAD_REFLECT_1D operation (ggml/1034) PAB 2024-12-03 20:20:04 +0100
  • 22616f00f9 files : remove make artifacts Georgi Gerganov 2024-12-03 20:29:32 +0200
  • 02c6fcbc2c common : fix compile warning Georgi Gerganov 2024-12-03 20:25:37 +0200
  • 3daeacad24 ggml : move AMX to the CPU backend (llama/10570) Diego Devesa 2024-12-03 20:22:12 +0200
  • 4d73962da4 metal : small-batch mat-mul kernels (llama/10581) Georgi Gerganov 2024-12-03 11:52:33 +0200
  • 068812650e SYCL: Fix and switch to GGML_LOG system instead of fprintf (llama/10579) Akarshan Biswas 2024-12-02 12:34:11 +0530
  • 4b7e059e15 ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (llama/10567) Adrien Gallouët 2024-11-30 18:13:18 +0100
  • 30e35d7271 vulkan: Dynamic subgroup size support for Q6_K mat_vec (llama/10536) Eve 2024-11-30 07:00:02 +0000
  • 3623bd58f2 ggml : fix I8MM Q4_1 scaling factor conversion (llama/10562) Georgi Gerganov 2024-11-29 16:25:39 +0200
  • cb847c20a7 ggml-cpu: fix typo in gemv/gemm iq4_nl_4_4 (llama/10580) Shupei Fan 2024-11-29 21:49:02 +0800
  • 964b154a2a sycl : offload of get_rows set to 0 (llama/10432) Alberto Cabrera Pérez 2024-11-29 12:38:45 +0000
  • d7c2a04bce sycl : Reroute permuted mul_mats through oneMKL (llama/10408) Alberto Cabrera Pérez 2024-11-29 09:49:43 +0000
  • 2bb4ca9cba CANN: RoPE operator optimization (llama/10563) Chenguang Li 2024-11-29 14:46:55 +0800
  • a753a82462 vulkan: get the first command buffer submitted sooner (llama/10499) Jeff Bolz 2024-11-29 00:18:02 -0600
  • 276b08d8f0 ggml : remove redundant copyright notice + update authors Georgi Gerganov 2024-11-28 20:46:40 +0200
  • 4ca1e72fe0 ggml : fix row condition for i8mm kernels (llama/10561) Georgi Gerganov 2024-11-28 14:56:37 +0200
  • 16a66f103f cmake : fix ARM feature detection (llama/10543) Georgi Gerganov 2024-11-28 14:56:23 +0200
  • 330273901f ggml-cpu: support IQ4_NL_4_4 by runtime repack (llama/10541) Shupei Fan 2024-11-28 20:52:03 +0800
  • 42099a9342 kompute : improve backend to pass test_backend_ops (llama/10542) Sergio López 2024-11-28 12:51:38 +0100
  • 90dd5fca9c CANN: Fix SOC_TYPE compile bug (llama/10519) leo-pony 2024-11-28 15:25:24 +0800
  • 2490f2a7f8 CANN: ROPE operator optimization (llama/10540) Chenguang Li 2024-11-28 14:24:46 +0800
  • 230e985633 Add some minimal optimizations for CDNA (llama/10498) uvos 2024-11-27 17:10:08 +0100
  • ae24083f23 metal : fix group_norm support condition (llama/0) Georgi Gerganov 2024-11-27 11:22:14 +0200
  • 6463e36369 vulkan: define all quant data structures in types.comp (llama/10440) Jeff Bolz 2024-11-27 01:32:54 -0600
  • b3301f7d82 vulkan: Handle GPUs with less shared memory (llama/10468) Jeff Bolz 2024-11-27 01:30:27 -0600
  • ab5d4d93ec vulkan: further optimize q5_k mul_mat_vec (llama/10479) Jeff Bolz 2024-11-27 01:21:59 -0600
  • 2d6e9dd723 vulkan: skip integer div/mod in get_offsets for batch_idx==0 (llama/10506) Jeff Bolz 2024-11-27 01:08:54 -0600
  • 2f16e51553 vulkan: optimize Q2_K and Q3_K mul_mat_vec (llama/10459) Jeff Bolz 2024-11-27 01:00:50 -0600
  • 0f0994902f mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (llama/10516) R0CKSTAR 2024-11-27 00:00:41 +0800
  • 5e1fcc1780 vulkan: fix group_norm (llama/10496) Jeff Bolz 2024-11-26 09:45:05 -0600
  • 48f421de23 cmake : enable warnings in llama (llama/10474) Georgi Gerganov 2024-11-26 14:18:08 +0200
  • e7afb2b991 ggml-cpu: cmake add arm64 cpu feature check for macos (llama/10487) Charles Xu 2024-11-26 12:37:05 +0100
  • 9a5ef7b169 CANN: Improve the Inferencing Performance for Ascend NPU Device (llama/10454) Shanshan Shen 2024-11-26 18:08:37 +0800
  • 453cc0fcf1 CANN: RoPE and CANCAT operator optimization (llama/10488) Chenguang Li 2024-11-26 17:31:05 +0800
  • 78dfec6bc5 vulkan: Fix a vulkan-shaders-gen arugment parsing error (llama/10484) Junil Kim 2024-11-26 10:47:20 +0900
  • f6d518fc4c metal : enable mat-vec kernels for bs <= 4 (llama/10491) Georgi Gerganov 2024-11-25 21:49:31 +0200
  • ac33379a35 llama : accept a list of devices to use to offload a model (llama/10497) Diego Devesa 2024-11-25 19:30:06 +0100
  • 77e3e4a090 ggml : add support for dynamic loading of backends (llama/10469) Diego Devesa 2024-11-25 15:13:39 +0100
  • b840bb09be metal : minor code formatting Georgi Gerganov 2024-11-25 15:08:04 +0200
  • 8b1c1c30a7 ggml : do not use ARM features not included in the build (llama/10457) Diego Devesa 2024-11-23 14:41:12 +0100
  • 4b81335f75 CANN: Support Ascend310P to accelerate F32 and F16 Model (llama/10216) leo-pony 2024-11-22 14:07:20 +0800
  • 2a4b5c9d7e cuda : optimize argmax (llama/10441) Diego Devesa 2024-11-21 18:18:50 +0100
  • 04662748aa vulkan: predicate max operation in soft_max shaders/soft_max (llama/10437) Jeff Bolz 2024-11-20 13:47:36 -0600
  • a117279e13 vulkan: copy iq4_nl LUT into shared memory (llama/10409) Jeff Bolz 2024-11-20 01:40:18 -0600
  • bbb292ed38 vulkan: further optimize mul_mat_vec using larger loads (llama/10387) Jeff Bolz 2024-11-20 01:11:00 -0600
  • 95e8901e71 add cmake rvv support (llama/10411) haopeng 2024-11-20 04:10:31 +0800
  • 4af9626702 CUDA: remove unnecessary warp reduce in FA (ggml/1032) mahorozte 2024-12-03 21:11:43 +0800
  • c52d1035de feat: add GGML_UNARY_OP_ARGMAX Metal kernel (ggml/1019) PAB 2024-12-02 19:27:24 +0100
  • 5773a14980 metal : add GGML_OP_CONV_TRANSPOSE_1D kernels (ggml/1026) PAB 2024-11-28 09:25:06 +0100
  • 6939147c47 Do not include arm_neon.h when compiling CUDA code (ggml/1028) Frankie Robertson 2024-11-26 15:50:26 +0200
  • 98f9916c9f ggml-opt: fix data corruption (ggml/1022) Johannes Gäßler 2024-11-20 14:56:04 +0100
  • 280d2735bc ci : disable freeBSD builds [no ci] Georgi Gerganov 2024-12-08 15:52:57 +0200
  • 668930a989 readme : update build instructions Georgi Gerganov 2024-12-08 15:48:14 +0200
  • 762f63e2d0 ci : disable CUDA and Android builds Georgi Gerganov 2024-12-08 15:36:01 +0200
  • a5cd03a921 ci : disable Obj-C build + fixes Georgi Gerganov 2024-12-08 13:35:35 +0200
  • e3d545e5a6 Fix vulkan Makefile paths Lluís Batlle i Rossell 2024-12-07 12:27:48 +0100
  • ae769eae71 Use C++17 Kitaiti Makoto 2024-12-06 23:05:42 +0900
  • 729effe4cf make : shim cmake Georgi Gerganov 2024-12-06 15:34:53 +0200
  • 1a1fcd37cf talk-llama : sync llama.cpp Georgi Gerganov 2024-12-05 14:30:33 +0200
  • dfe6652b0d sync : ggml Georgi Gerganov 2024-12-05 14:29:18 +0200
  • dfddca02ec ggml : add predefined list of CPU backend variants to build (llama/10626) Diego Devesa 2024-12-04 14:45:40 +0100
  • 61aff48839 ggml-cpu : fix HWCAP2_I8MM value (llama/10646) Diego Devesa 2024-12-04 14:40:44 +0100
  • b311da34cf vulkan: Implement "fast divide" (mul+shift) for unary ops like copy (llama/10642) Jeff Bolz 2024-12-04 01:28:59 -0600
  • 3085e2883a SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (llama/10584) Nicolò Scipione 2024-12-04 02:29:20 +0100
  • 9623ba19b2 Avoid using __fp16 on ARM with old nvcc (llama/10616) Frankie Robertson 2024-12-04 02:41:37 +0200
  • 03331b1de8 vulkan: optimize and reenable split_k (llama/10637) Jeff Bolz 2024-12-03 13:29:54 -0600
  • e20efac003 ggml: add GGML_SET Metal kernel + i32 CPU kernel (ggml/1037) PAB 2024-12-04 09:19:30 +0100
  • 40d5987bf3 ggml : add GGML_PAD_REFLECT_1D operation (ggml/1034) PAB 2024-12-03 20:20:04 +0100