Pull requests: ggml-org/llama.cpp
openvino: Update to OV 2026.2.1, self-contained release packages, operator improvements
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
OpenVINO
#24974
opened Jun 24, 2026 by
ravi9
Contributor
ggml: add Rockchip NPU (RKNPU2) backend for RK3588
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
vibe-coded
Created with heavy use of LLM assistants, requires human verification
#24972
opened Jun 24, 2026 by
alexinthesky
2 of 5 tasks
common : honor case_sensitive argument in jinja sort and dictsort
jinja parser
Issues related to the jinja parser
testing
Everything test related
#24971
opened Jun 24, 2026 by
dnislno
mtmd: add unlimited-ocr (converter, full MHA)
examples
python
python script changes
#24969
opened Jun 24, 2026 by
sfallah
Contributor
common: remove unused json-partial
testing
Everything test related
#24968
opened Jun 24, 2026 by
ngxson
Collaborator
vulkan: disable MMVQ on AMD UMA devices
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#24966
opened Jun 24, 2026 by
winstonma
Contributor
opencl: support non-contig rows in norm
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
Improve Server OAI Responses API streaming compatibility
examples
server
#24957
opened Jun 23, 2026 by
boondocklabs
Contributor
server : create context checkpoint on slot restore
examples
server
#24956
opened Jun 23, 2026 by
julio50
hexagon: MUL_MAT and MUL_MAT_ID rework : 32x32 tiled weight repack, kernel-params, cached graphs
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
Hexagon
python
python script changes
script
Script related
testing
Everything test related
#24954
opened Jun 23, 2026 by
max-krasnyansky
Member
bench: Fix misc. bug #24951 - Standard Deviation issues
examples
#24953
opened Jun 23, 2026 by
surfidaho
refactor(server): move speculative init to speculative.cpp
examples
server
#24952
opened Jun 23, 2026 by
wadealexc
cli : move to HTTP-based implementation
examples
server
#24948
opened Jun 23, 2026 by
ngxson
Collaborator
cuda : prevent integer truncation and overflow errors when using KQ mask strides in flash_attn_mask_to_KV_max kernel
CUDA
Related to the CUDA backend
ggml
changes relating to the ggml tensor library for machine learning
#24945
opened Jun 23, 2026 by
fairydreaming
Collaborator
server : disable embeddings/pooling on the speculative draft/MTP context
examples
server
#24942
opened Jun 23, 2026 by
liminfei-amd
Contributor
1 task done
sycl : clamp softmax input to avoid underflow
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#24941
opened Jun 23, 2026 by
Jassieluo
llama : meta split state for combined gate+up ffn_up (phi3)
#24936
opened Jun 23, 2026 by
krystophny
llama : synchronize context before backend teardown
testing
Everything test related
#24935
opened Jun 23, 2026 by
krystophny
Llama sampler speedups/fixes
testing
Everything test related
#24934
opened Jun 23, 2026 by
matthiasstraka
Contributor
docs: note ROCm HIP SDK 7.1.1 + MSVC >=14.40 <cmath> build failure and workaround (Windows)
documentation
Improvements or additions to documentation
#24929
opened Jun 23, 2026 by
Dixon-Cider
Fix conditional to display 'LLAMA_SPLIT_MODE_TENSOR not implemented for architecture' message
#24926
opened Jun 23, 2026 by
kdkd
Minimax M3 EAGLE3 Support
model
Model specific
python
python script changes
testing
Everything test related
#24925
opened Jun 23, 2026 by
nick-tonjum