Upgrade vendored llama.cpp to b9585 by mcharytoniuk · Pull Request #20 · intentee/llama-cpp-bindings

mcharytoniuk · 2026-06-10T17:21:26Z

Upgrades the vendored llama.cpp submodule to the official b9585 tag, adds a per-model chat-parser cache, and removes all clippy allow/expect suppressions.

🤖 Generated with Claude Code


Copilot

Pull request overview

This PR updates the vendored llama.cpp integration and adapts the Rust bindings/tests to match the new upstream APIs. It introduces a per-LlamaModel cached chat parser handle (to avoid repeated template analysis), updates chat-template application to support an enable_thinking flag, and replaces a multi-argument multimodal evaluation call with a parameter struct. It also centralizes several Clippy suppressions into per-crate Cargo.toml lint config.
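The per-model parser cache described above can be illustrated with a lazily initialized field. This is only a minimal sketch of the general technique, not the crate's implementation: `Model`, `ChatParser`, `chat_parser()`, and `analysis_count()` are hypothetical names, and the "analysis" step is a stand-in for the real template analysis done on the C++ side.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for parser state derived from a chat template.
#[derive(Debug)]
pub struct ChatParser {
    pub template_len: usize,
}

/// Hypothetical stand-in for a loaded model that owns its chat template.
pub struct Model {
    chat_template: String,
    // Filled on first use, then reused for the lifetime of the model.
    chat_parser: OnceLock<ChatParser>,
    // Counts how often the template was analyzed; for demonstration only.
    analysis_count: AtomicUsize,
}

impl Model {
    pub fn new(chat_template: impl Into<String>) -> Self {
        Self {
            chat_template: chat_template.into(),
            chat_parser: OnceLock::new(),
            analysis_count: AtomicUsize::new(0),
        }
    }

    /// Returns the cached parser, analyzing the template only on the first call.
    pub fn chat_parser(&self) -> &ChatParser {
        self.chat_parser.get_or_init(|| {
            self.analysis_count.fetch_add(1, Ordering::Relaxed);
            ChatParser {
                template_len: self.chat_template.len(),
            }
        })
    }

    /// Number of times the template has been analyzed so far.
    pub fn analysis_count(&self) -> usize {
        self.analysis_count.load(Ordering::Relaxed)
    }
}
```

Calling `chat_parser()` repeatedly on the same `Model` returns a reference to the same value, so the analysis cost is paid once per model rather than once per parse call.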

Changes:

Upgrade the vendored llama.cpp interface and adjust C++ wrappers (chat parsing, chat template application, MTMD bitmap init, and build defines).
Add a per-model cached chat parser in LlamaModel and update parsing/apply APIs (including an enable_thinking switch).
Refactor multimodal chunk evaluation parameters into EvalMultimodalChunksParams and update tests accordingly; move Clippy lint exceptions to crate-level config.
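To illustrate the refactor from a multi-argument call to a parameter struct, here is a small sketch. The struct name, its fields, the default values, and the toy `eval_chunks` function are all assumptions made for illustration; they are not taken from the crate's actual `EvalMultimodalChunksParams`.

```rust
/// Hypothetical parameter struct; field names and defaults are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EvalParamsSketch {
    pub n_past: i32,
    pub seq_id: i32,
    pub n_batch: i32,
    pub logits_last: bool,
}

impl Default for EvalParamsSketch {
    fn default() -> Self {
        Self {
            n_past: 0,
            seq_id: 0,
            n_batch: 512,
            logits_last: true,
        }
    }
}

/// Toy evaluation: returns the past-token position after consuming every chunk.
/// A real implementation would decode the chunks; this one only sums token counts.
pub fn eval_chunks(chunk_token_counts: &[i32], params: EvalParamsSketch) -> i32 {
    params.n_past + chunk_token_counts.iter().sum::<i32>()
}
```

With a struct, call sites name only the fields they change, for example `EvalParamsSketch { n_past: 10, ..EvalParamsSketch::default() }`, and adding a field later does not reorder positional arguments at every caller.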

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 4 comments.


File	Description
llama-cpp-test-harness/tests/harness_self_test.rs	Removes per-file Clippy `expect` now handled via crate lint config.
llama-cpp-test-harness/Cargo.toml	Allows `unnecessary_wraps` with rationale for harness trial function signatures.
llama-cpp-bindings/src/tool_call_format/paired_quote_args.rs	Removes test-module Clippy `expect`; relies on crate lint config.
llama-cpp-bindings/src/send_logs_to_log.rs	Makes “deny panic/unwrap/…” conditional on `not(test)`; removes test-level suppression.
llama-cpp-bindings/src/sampled_token_classifier.rs	Changes multimodal eval API to accept `EvalMultimodalChunksParams`.
llama-cpp-bindings/src/mtmd/mtmd_input_chunk.rs	Updates MTMD image batch-size mismatch construction to avoid lossy casts.
llama-cpp-bindings/src/mtmd/mtmd_bitmap.rs	Adapts to upstream MTMD bitmap wrapper return type and new init flag; updates audio test fixture generation.
llama-cpp-bindings/src/mtmd/image_chunk_batch_size_mismatch.rs	Changes public mismatch field types (u32→usize/i32).
llama-cpp-bindings/src/model.rs	Adds cached chat parser handle + FFI status mapping; adds `enable_thinking` arg to `apply_chat_template`.
llama-cpp-bindings/src/llama_batch.rs	Removes Clippy suppression by using pointer casting helpers for `llama_batch_get_one`.
llama-cpp-bindings/src/lib.rs	Exposes new `eval_multimodal_chunks_params` module/type.
llama-cpp-bindings/src/eval_multimodal_chunks_params.rs	New params struct for multimodal evaluation calls.
llama-cpp-bindings/src/context/params.rs	Makes `LlamaContextParams` `Copy` and removes a Clippy suppression.
llama-cpp-bindings/src/context.rs	Removes per-fn Clippy suppression now that params are `Copy` and used by value.
llama-cpp-bindings/Cargo.toml	Allows `literal_string_with_formatting_args` crate-wide for tool-call fixtures.
llama-cpp-bindings-tests/tests/vocabulary_and_metadata.rs	Removes file-level Clippy `expect`; renames variables for clarity.
llama-cpp-bindings-tests/tests/sampling_and_constrained_decoding.rs	Updates throughput computation and `apply_chat_template` call signature.
llama-cpp-bindings-tests/tests/reasoning_markers_and_tool_calls.rs	Refactors a large test into helpers; updates `apply_chat_template` calls.
llama-cpp-bindings-tests/tests/multimodal_vision.rs	Updates to `EvalMultimodalChunksParams`-based multimodal eval API.
llama-cpp-bindings-tests/tests/multimodal_image_and_audio.rs	Adds helper to load fixture bitmaps; updates multimodal eval and expected output assertion.
llama-cpp-bindings-tests/tests/multimodal_audio.rs	Updates `apply_chat_template` and multimodal eval calls to new signatures.
llama-cpp-bindings-tests/tests/model_loading_errors.rs	Removes file-level Clippy `expect`.
llama-cpp-bindings-tests/tests/kv_cache_and_session.rs	Removes file-level Clippy `expect`.
llama-cpp-bindings-tests/tests/embedding_and_encoder.rs	Updates throughput calculation to avoid precision-loss lint.
llama-cpp-bindings-tests/tests/chat_template_and_message_parsing.rs	Updates `apply_chat_template` calls to include `enable_thinking`.
llama-cpp-bindings-tests/tests/backend_initialization.rs	Removes file-level Clippy `expect`.
llama-cpp-bindings-tests/src/build_user_prompt_with_media_marker.rs	Updates `apply_chat_template` call to include `enable_thinking`.
llama-cpp-bindings-tests/Cargo.toml	Reorders lint entries and allows `unnecessary_wraps` for harness-style trial fns.
llama-cpp-bindings-sys/wrapper_mtmd.cpp	Adapts to upstream `mtmd_helper_bitmap_init_from_file(..., false)` wrapper return type.
llama-cpp-bindings-sys/wrapper_chat_parse.h	Introduces chat parser handle API (`create/free`) and updates parse entrypoint to accept the parser handle.
llama-cpp-bindings-sys/wrapper_chat_parse.cpp	Implements chat parser caching primitives and PEG-native parsing flow.
llama-cpp-bindings-sys/wrapper_chat_apply.h	Extends chat-template application FFI with `enable_thinking`.
llama-cpp-bindings-sys/wrapper_chat_apply.cpp	Sets `inputs.enable_thinking` when applying chat templates.
llama-cpp-bindings-build/src/cmake_config.rs	Disables upstream app build; replaces runtime feature assertion with `compile_error!`.
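
Several rows above mention avoiding lossy casts when comparing sizes of different integer types. One common way to do that in Rust is a checked conversion with `TryFrom` instead of `as`. The sketch below assumes a hypothetical mismatch type with `usize` and `i32` fields; the names `BatchSizeMismatch` and `check_image_fits` are illustrative and may not match the crate.

```rust
use std::convert::TryFrom;

/// Hypothetical error describing an image that does not fit into one batch.
#[derive(Debug, PartialEq, Eq)]
pub struct BatchSizeMismatch {
    pub image_tokens: usize,
    pub n_batch: i32,
}

/// Checks that `image_tokens` fits into `n_batch` without using an `as` cast.
pub fn check_image_fits(image_tokens: usize, n_batch: i32) -> Result<(), BatchSizeMismatch> {
    // A negative batch size can never fit; `try_from` rejects it instead of wrapping.
    match usize::try_from(n_batch) {
        Ok(capacity) if image_tokens <= capacity => Ok(()),
        _ => Err(BatchSizeMismatch {
            image_tokens,
            n_batch,
        }),
    }
}
```

Keeping the original values in the error, rather than casting both to a common type, preserves exactly what the caller passed in.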


Copilot

Pull request overview

Copilot reviewed 61 out of 61 changed files in this pull request and generated no new comments.

Upgrade vendored llama.cpp to b9585 with per-model chat-parser cache and clippy cleanup

6762952

mcharytoniuk requested a review from Copilot June 10, 2026 17:21

Copilot started reviewing on behalf of mcharytoniuk June 10, 2026 17:22

Copilot AI reviewed Jun 10, 2026

Comment thread llama-cpp-bindings-sys/wrapper_chat_parse.cpp Outdated

Comment thread llama-cpp-bindings/src/model.rs

Comment thread llama-cpp-bindings/src/sampled_token_classifier.rs

Comment thread llama-cpp-bindings/src/mtmd/image_chunk_batch_size_mismatch.rs

mcharytoniuk added 5 commits June 10, 2026 20:02

Move MarkerKind to its own module and order imports vendor-first

4c992d8

Align test-harness lint table ordering with bindings-tests

7556aad

Add cppcheck and clang-tidy linters with vendored GSL and fix all C++ findings

601de4f

Move the reasoning-marker probe from C++ to Rust and fix the chat-parser creation leak

5488d2a

Exclude vendored GSL headers and config-count notes from cppcheck linting

86c37a2

mcharytoniuk requested a review from Copilot June 11, 2026 02:13

Copilot started reviewing on behalf of mcharytoniuk June 11, 2026 02:13

Copilot AI reviewed Jun 11, 2026

mcharytoniuk merged commit fe50283 into main Jun 11, 2026
7 checks passed

mcharytoniuk deleted the bump-llama-cpp-b9585 branch June 11, 2026 12:19

mcharytoniuk merged 6 commits into main from bump-llama-cpp-b9585

2 participants
