
Upgrade vendored llama.cpp to b9585 #20

Merged: mcharytoniuk merged 6 commits into main from bump-llama-cpp-b9585 on Jun 11, 2026

@mcharytoniuk

Upgrades the vendored llama.cpp submodule to the official b9585 tag, adds a per-model chat-parser cache, and removes all clippy allow/expect suppressions.

🤖 Generated with Claude Code

Copilot AI left a comment

Pull request overview

This PR updates the vendored llama.cpp integration and adapts the Rust bindings/tests to match the new upstream APIs. It introduces a per-LlamaModel cached chat parser handle (to avoid repeated template analysis), updates chat-template application to support an enable_thinking flag, and replaces a multi-argument multimodal evaluation call with a parameter struct. It also centralizes several Clippy suppressions into per-crate Cargo.toml lint config.
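
The per-model parser cache described above can be pictured with a small sketch. This is not the crate's implementation: `Model`, `ChatParser`, and `analysis_runs` are hypothetical stand-ins, and the real parser handle is created through FFI and may fail, which `OnceLock::get_or_init` as used here does not model.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for the parser handle built from a chat template.
#[derive(Debug)]
struct ChatParser {
    template: String,
}

/// Hypothetical stand-in for `LlamaModel`; only the caching logic is shown.
struct Model {
    chat_template: String,
    // Filled on first use, then shared by every later call.
    chat_parser: OnceLock<ChatParser>,
    // Counts how often the analysis runs; for demonstration only.
    analysis_runs: AtomicUsize,
}

impl Model {
    fn new(chat_template: &str) -> Self {
        Self {
            chat_template: chat_template.to_owned(),
            chat_parser: OnceLock::new(),
            analysis_runs: AtomicUsize::new(0),
        }
    }

    /// Returns the cached parser, building it on the first call only.
    fn chat_parser(&self) -> &ChatParser {
        self.chat_parser.get_or_init(|| {
            self.analysis_runs.fetch_add(1, Ordering::Relaxed);
            ChatParser {
                template: self.chat_template.clone(),
            }
        })
    }
}
```

Calling `chat_parser()` repeatedly on the same `Model` returns the same reference, so the template analysis runs once per model rather than once per parse.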

Changes:

  • Upgrade the vendored llama.cpp interface and adjust C++ wrappers (chat parsing, chat template application, MTMD bitmap init, and build defines).
  • Add a per-model cached chat parser in LlamaModel and update parsing/apply APIs (including an enable_thinking switch).
  • Refactor multimodal chunk evaluation parameters into EvalMultimodalChunksParams and update tests accordingly; move Clippy lint exceptions to crate-level config.
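
As a rough illustration of the multi-argument to parameter-struct refactor listed above, the sketch below uses a hypothetical `EvalChunksParams` type. Its field names and the `eval_chunks` function are assumptions made for the example; they are not the fields of `EvalMultimodalChunksParams` or the crate's evaluation API.

```rust
/// Hypothetical parameter struct; field names are illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EvalChunksParams {
    n_past: i32,
    n_batch: i32,
    logits_last: bool,
}

/// Hypothetical result type for the stand-in evaluation function.
#[derive(Debug, PartialEq, Eq)]
struct EvalSummary {
    n_past: i32,
    decode_calls: i32,
    logits_returned: bool,
}

/// Stand-in for an evaluation call that previously took each value as a
/// separate positional argument.
fn eval_chunks(n_tokens: i32, params: EvalChunksParams) -> EvalSummary {
    assert!(params.n_batch > 0, "n_batch must be positive");
    // Ceiling division: number of batches of `n_batch` tokens needed.
    let decode_calls = (n_tokens + params.n_batch - 1) / params.n_batch;
    EvalSummary {
        n_past: params.n_past + n_tokens,
        decode_calls,
        logits_returned: params.logits_last,
    }
}
```

Named fields make call sites self-describing and keep adjacent `i32` and `bool` arguments from being swapped silently, which is a common motivation for this kind of refactor.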

Reviewed changes

Copilot reviewed 35 out of 35 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| llama-cpp-test-harness/tests/harness_self_test.rs | Removes per-file Clippy expect now handled via crate lint config. |
| llama-cpp-test-harness/Cargo.toml | Allows unnecessary_wraps with rationale for harness trial function signatures. |
| llama-cpp-bindings/src/tool_call_format/paired_quote_args.rs | Removes test-module Clippy expect; relies on crate lint config. |
| llama-cpp-bindings/src/send_logs_to_log.rs | Makes “deny panic/unwrap/…” conditional on not(test); removes test-level suppression. |
| llama-cpp-bindings/src/sampled_token_classifier.rs | Changes multimodal eval API to accept EvalMultimodalChunksParams. |
| llama-cpp-bindings/src/mtmd/mtmd_input_chunk.rs | Updates MTMD image batch-size mismatch construction to avoid lossy casts. |
| llama-cpp-bindings/src/mtmd/mtmd_bitmap.rs | Adapts to upstream MTMD bitmap wrapper return type and new init flag; updates audio test fixture generation. |
| llama-cpp-bindings/src/mtmd/image_chunk_batch_size_mismatch.rs | Changes public mismatch field types (u32→usize/i32). |
| llama-cpp-bindings/src/model.rs | Adds cached chat parser handle + FFI status mapping; adds enable_thinking arg to apply_chat_template. |
| llama-cpp-bindings/src/llama_batch.rs | Removes Clippy suppression by using pointer casting helpers for llama_batch_get_one. |
| llama-cpp-bindings/src/lib.rs | Exposes new eval_multimodal_chunks_params module/type. |
| llama-cpp-bindings/src/eval_multimodal_chunks_params.rs | New params struct for multimodal evaluation calls. |
| llama-cpp-bindings/src/context/params.rs | Makes LlamaContextParams Copy and removes a Clippy suppression. |
| llama-cpp-bindings/src/context.rs | Removes per-fn Clippy suppression now that params are Copy and used by value. |
| llama-cpp-bindings/Cargo.toml | Allows literal_string_with_formatting_args crate-wide for tool-call fixtures. |
| llama-cpp-bindings-tests/tests/vocabulary_and_metadata.rs | Removes file-level Clippy expect; renames variables for clarity. |
| llama-cpp-bindings-tests/tests/sampling_and_constrained_decoding.rs | Updates throughput computation and apply_chat_template call signature. |
| llama-cpp-bindings-tests/tests/reasoning_markers_and_tool_calls.rs | Refactors a large test into helpers; updates apply_chat_template calls. |
| llama-cpp-bindings-tests/tests/multimodal_vision.rs | Updates to EvalMultimodalChunksParams-based multimodal eval API. |
| llama-cpp-bindings-tests/tests/multimodal_image_and_audio.rs | Adds helper to load fixture bitmaps; updates multimodal eval and expected output assertion. |
| llama-cpp-bindings-tests/tests/multimodal_audio.rs | Updates apply_chat_template and multimodal eval calls to new signatures. |
| llama-cpp-bindings-tests/tests/model_loading_errors.rs | Removes file-level Clippy expect. |
| llama-cpp-bindings-tests/tests/kv_cache_and_session.rs | Removes file-level Clippy expect. |
| llama-cpp-bindings-tests/tests/embedding_and_encoder.rs | Updates throughput calculation to avoid precision-loss lint. |
| llama-cpp-bindings-tests/tests/chat_template_and_message_parsing.rs | Updates apply_chat_template calls to include enable_thinking. |
| llama-cpp-bindings-tests/tests/backend_initialization.rs | Removes file-level Clippy expect. |
| llama-cpp-bindings-tests/src/build_user_prompt_with_media_marker.rs | Updates apply_chat_template call to include enable_thinking. |
| llama-cpp-bindings-tests/Cargo.toml | Reorders lint entries and allows unnecessary_wraps for harness-style trial fns. |
| llama-cpp-bindings-sys/wrapper_mtmd.cpp | Adapts to upstream mtmd_helper_bitmap_init_from_file(..., false) wrapper return type. |
| llama-cpp-bindings-sys/wrapper_chat_parse.h | Introduces chat parser handle API (create/free) and updates parse entrypoint to accept the parser handle. |
| llama-cpp-bindings-sys/wrapper_chat_parse.cpp | Implements chat parser caching primitives and PEG-native parsing flow. |
| llama-cpp-bindings-sys/wrapper_chat_apply.h | Extends chat-template application FFI with enable_thinking. |
| llama-cpp-bindings-sys/wrapper_chat_apply.cpp | Sets inputs.enable_thinking when applying chat templates. |
| llama-cpp-bindings-build/src/cmake_config.rs | Disables upstream app build; replaces runtime feature assertion with compile_error!. |
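
For readers unfamiliar with the `compile_error!` change noted for cmake_config.rs, here is a minimal sketch of the general pattern of moving a feature check from run time to compile time. The feature names `backend-a` and `backend-b` and the `selected_backend` function are hypothetical; the actual condition checked in the PR is not shown on this page.

```rust
// Hypothetical feature names; the real crate's features may differ.
// If both features are enabled, the build fails here instead of
// panicking later at run time.
#[cfg(all(feature = "backend-a", feature = "backend-b"))]
compile_error!("features `backend-a` and `backend-b` are mutually exclusive");

/// Reports which hypothetical backend feature was selected at compile time.
fn selected_backend() -> &'static str {
    if cfg!(feature = "backend-a") {
        "backend-a"
    } else if cfg!(feature = "backend-b") {
        "backend-b"
    } else {
        "cpu"
    }
}
```

With neither feature enabled the `compile_error!` line is compiled out and `selected_backend()` falls through to its default branch.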

Comment thread llama-cpp-bindings-sys/wrapper_chat_parse.cpp Outdated
Comment thread llama-cpp-bindings/src/model.rs
Comment thread llama-cpp-bindings/src/sampled_token_classifier.rs
Comment thread llama-cpp-bindings/src/mtmd/image_chunk_batch_size_mismatch.rs

Copilot AI left a comment

Pull request overview

Copilot reviewed 61 out of 61 changed files in this pull request and generated no new comments.

@mcharytoniuk merged commit fe50283 into main on Jun 11, 2026
7 checks passed
@mcharytoniuk deleted the bump-llama-cpp-b9585 branch on June 11, 2026 at 12:19