Adopt command_handler crate, upgrade llama-cpp-bindings to 0.9.0, and remove inline imports/lint suppressions #246

Merged
mcharytoniuk merged 1 commit into main from command-handler-and-llama-cpp-bindings-0.9.0 on Jun 10, 2026

@mcharytoniuk

Adopts the published command_handler crate (Handler, shutdown signal, socket-addr value parser), upgrades to llama-cpp-bindings 0.9.0, and removes inline imports, pub use re-exports, and in-code clippy suppressions.

🤖 Generated with Claude Code

… remove inline imports and lint suppressions
Copilot AI review requested due to automatic review settings June 10, 2026 17:46

Copilot AI left a comment

Pull request overview

This PR modernizes the workspace by adopting the external command_handler crate for common CLI/app concerns, upgrading llama-cpp-bindings to 0.9.0, and removing a large amount of inline-import and clippy-suppression noise. It also updates GPU offload layer counts to support “all layers” semantics via -1, and improves robustness around token classification/detokenization failures in the continuous-batch scheduler.
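
The -1 convention for GPU offload can be sketched as follows. This is a minimal illustration rather than the code from this PR: the `GpuOffload` enum and the `gpu_offload_from_n_gpu_layers` function are hypothetical names, and rejecting negative values other than -1 is an assumption of the sketch.

```rust
/// Hypothetical representation of how many layers to offload to the GPU.
#[derive(Debug, PartialEq, Eq)]
enum GpuOffload {
    /// Offload every layer (requested with -1).
    AllLayers,
    /// Offload exactly this many layers; 0 keeps the model on the CPU.
    Layers(u32),
}

/// Interprets a signed layer count, treating -1 as "all layers".
/// Rejecting other negative values is an assumption of this sketch.
fn gpu_offload_from_n_gpu_layers(n_gpu_layers: i32) -> Result<GpuOffload, String> {
    match n_gpu_layers {
        -1 => Ok(GpuOffload::AllLayers),
        // try_from fails for any remaining negative value.
        count => u32::try_from(count)
            .map(GpuOffload::Layers)
            .map_err(|_| format!("invalid n_gpu_layers value: {count}")),
    }
}
```

A signed type is what makes the sentinel possible: an unsigned count has no spare value to mean "everything", which is presumably why the field moved to i32.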

Changes:

  • Adopt command_handler for the Handler trait, shutdown-signal handling, and socket-addr parsing; remove local equivalents.
  • Upgrade llama-cpp-bindings* to =0.9.0 and adjust call sites (e.g., validator_for, MTMD marker, sampler error handling).
  • Refactor/import cleanup across tests and crates, plus introduce a new terminal error token (DetokenizationFailed) that is treated as “done”.
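
To illustrate what a socket-addr value parser typically does, here is a standard-library-only sketch. It is not the command_handler API, whose signatures are not shown in this PR; the function name `parse_socket_addr` is borrowed from the file name in the table and its body here is an assumption.

```rust
use std::net::SocketAddr;
use std::net::ToSocketAddrs;

/// Hypothetical value parser: resolves "host:port" input to the first
/// matching socket address. Uses only std, not the command_handler crate.
fn parse_socket_addr(input: &str) -> Result<SocketAddr, String> {
    input
        .to_socket_addrs()
        .map_err(|err| format!("failed to resolve socket address '{input}': {err}"))?
        .next()
        .ok_or_else(|| format!("no socket address resolved for '{input}'"))
}
```

A function of the shape `fn(&str) -> Result<T, E>` is what CLI argument parsers commonly accept as a custom value parser, so a call such as `parse_socket_addr("127.0.0.1:8080")` can back an address flag directly.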

Reviewed changes

Copilot reviewed 111 out of 112 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| paddler_tests/tests/qwen35_internal_endpoint_with_thinking_enabled_emits_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen35_internal_endpoint_with_thinking_disabled_emits_only_content_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen35_internal_endpoint_emits_tool_call_parsed_event.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/qwen35_internal_endpoint_emits_reasoning_tokens_for_image_request.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_responses_non_streaming_returns_text_and_usage.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_streaming_usage_breakdown_with_thinking.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_streaming_emits_usage_when_requested.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_streaming_emits_tool_calls_for_function_tool.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_non_streaming_usage_with_tool_calls.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_non_streaming_returns_usage.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_openai_non_streaming_emits_tool_calls_for_function_tool.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_internal_endpoint_with_thinking_enabled_emits_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_internal_endpoint_with_thinking_disabled_emits_no_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_internal_endpoint_tools_without_parse_flag_emit_only_raw_tokens.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/qwen3_internal_endpoint_pure_content_usage.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_internal_endpoint_max_tokens_usage_matches.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/qwen3_internal_endpoint_emits_tool_call_tokens.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/qwen3_internal_endpoint_emits_tool_call_parsed_event.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/qwen3_internal_endpoint_concurrent_requests_independent_usage.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/mistral3_internal_endpoint_emits_tool_call_parsed_event.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/mistral3_internal_endpoint_emits_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/mistral3_internal_endpoint_emits_reasoning_tokens_for_image_request.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/gemma4_internal_endpoint_emits_tool_call_parsed_event.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/gemma4_internal_endpoint_emits_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/gemma4_internal_endpoint_emits_reasoning_tokens_for_image_request.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/deepseek_r1_distill_llama_8b_internal_endpoint_emits_reasoning_tokens.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/continuous_batch_stops_generation_when_stop_sender_dropped.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/continuous_batch_releases_slots_on_shutdown_with_active_request.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/continuous_batch_rejects_embedding_during_active_generation.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/continuous_batch_generates_tokens_with_partial_layer_offload.rs | Align partial GPU layer count type with i32. |
| paddler_tests/tests/agent_reports_tool_call_validation_failure.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/agent_reports_grammar_initialization_failure_for_invalid_gbnf.rs | Import cleanup and simplified anyhow error construction. |
| paddler_tests/tests/agent_rejects_tool_with_invalid_required_field_in_schema.rs | Use json! macro import and simplify. |
| paddler_tests/tests/agent_rejects_structurally_invalid_tool_schema.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/agent_pipeline_recognizes_duck_typed_tool_call_format_when_template_is_not_registered.rs | Import cleanup; use json! and simplified anyhow errors. |
| paddler_tests/tests/agent_conversation_with_function_tool_succeeds.rs | Use json! macro import. |
| paddler_tests/src/model_card/mod.rs | Change GPU layer count type to i32. |
| paddler_test_cluster_harness/src/wait_until_healthy.rs | Simplify anyhow error creation. |
| paddler_test_cluster_harness/src/resource_snapshot.rs | Narrow std::fs imports to used items. |
| paddler_test_cluster_harness/src/load_test_image_data_uri.rs | Narrow std::fs imports to used items. |
| paddler_test_cluster_harness/src/collect_generated_tokens.rs | Test cleanup and simplified anyhow error creation. |
| paddler_test_cluster_harness/src/collect_embedding_results.rs | Test cleanup and simplified anyhow error creation. |
| paddler_test_cluster_harness/src/buffered_requests_stream_watcher.rs | Test cleanup and simplified anyhow error creation. |
| paddler_openai_response_format_validator/src/openai_validator.rs | Use validator_for import and simplify call site. |
| paddler_messaging/src/inference_parameters.rs | n_gpu_layers type change to i32 and document -1 semantics. |
| paddler_messaging/src/generated_token_result.rs | Add DetokenizationFailed terminal outcome and tests. |
| paddler_messaging/src/embedding_normalization_method.rs | Narrow std::mem import to discriminant. |
| paddler_gui/src/app.rs | Switch shutdown-signal registration to command_handler. |
| paddler_gui/Cargo.toml | Add command_handler dependency. |
| paddler_download_manager/tests/download.rs | Narrow tokio::fs imports and update call sites. |
| paddler_download_manager/src/stream_to_partial_file.rs | Narrow tokio imports in tests and update call sites. |
| paddler_download_manager/src/partial_file.rs | Narrow tokio::fs imports and update call sites/tests. |
| paddler_client/src/inference_socket/spawn_read_task.rs | Use debug! macro via direct import. |
| paddler_client/src/error.rs | Remove inline clippy suppression (policy moved to workspace lints). |
| paddler_cli/src/main.rs | Make main explicit and call run() via imports. |
| paddler_cli/src/lib.rs | Adopt command_handler handler/shutdown integration; box Balancer subcommand. |
| paddler_cli/src/cmd/value_parser/parse_socket_addr.rs | Delegate socket resolution to command_handler. |
| paddler_cli/src/cmd/mod.rs | Remove local handler module export. |
| paddler_cli/src/cmd/handler.rs | Remove local Handler trait (replaced by command_handler). |
| paddler_cli/src/cmd/balancer.rs | Implement external Handler trait; adjust async-trait settings/signature. |
| paddler_cli/src/cmd/agent.rs | Implement external Handler trait; adjust async-trait settings/signature. |
| paddler_cli/Cargo.toml | Add command_handler; remove now-unused deps. |
| paddler_cli_tests/src/model_card/mod.rs | Change GPU layer count type to i32. |
| paddler_cache_dir/src/lib.rs | Select platform-specific cache_dir module via #[path]. |
| paddler_cache_dir/src/cached_downloaded_model.rs | Narrow tokio::fs imports to used items. |
| paddler_cache_dir/src/cache_dir/windows.rs | Narrow std::env import and simplify test import. |
| paddler_cache_dir/src/cache_dir/unix.rs | Narrow std::env import and simplify test import. |
| paddler_cache_dir/src/cache_dir/mod.rs | Remove old cfg re-export shim (replaced by crate-root module selection). |
| paddler_bootstrap/src/shutdown_signal/windows.rs | Remove internal shutdown-signal implementation (moved to command_handler). |
| paddler_bootstrap/src/shutdown_signal/unix.rs | Remove internal shutdown-signal implementation (moved to command_handler). |
| paddler_bootstrap/src/shutdown_signal/mod.rs | Remove internal shutdown-signal module export. |
| paddler_bootstrap/src/service_thread.rs | Narrow std::thread import and adjust JoinHandle type usage. |
| paddler_bootstrap/src/lib.rs | Stop exporting removed shutdown_signal module. |
| paddler_bootstrap/Cargo.toml | Remove unix-only dev-dependency no longer needed. |
| paddler_balancer/src/web_admin_panel_service/http_route/static_files.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/web_admin_panel_service/http_route/home.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/web_admin_panel_service/http_route/favicon.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/statsd_service/mod.rs | Remove inline lint suppression (policy moved to workspace lints). |
| paddler_balancer/src/state_database/file/mod.rs | Narrow tokio::fs imports and update call sites/tests. |
| paddler_balancer/src/response/view_from_http_response_builder.rs | Narrow std::mem usage to discriminant and adjust assertions. |
| paddler_balancer/src/management_service/mod.rs | Make test helpers fallible rather than unwrap-based. |
| paddler_balancer/src/management_service/http_route/api/ws_agent_socket/mod.rs | Remove inline lint suppression (policy moved to workspace lints). |
| paddler_balancer/src/management_service/http_route/api/put_balancer_desired_state.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/management_service/http_route/api/get_buffered_requests.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/management_service/http_route/api/get_balancer_desired_state.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/management_service/http_route/api/get_balancer_applicable_state.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/management_service/http_route/api/get_agents.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/inference_service/http_route/api/ws_inference_socket/mod.rs | Remove inline lint suppressions; narrow actix-web test imports. |
| paddler_balancer/src/inference_service/http_route/api/post_generate_embedding_batch.rs | Narrow actix-web test imports to used items. |
| paddler_balancer/src/controls_websocket_endpoint.rs | Remove inline lint suppression (policy moved to workspace lints). |
| paddler_balancer/src/compatibility/openai_service/responses_streaming_response_transformer.rs | Remove inline lint suppression (policy moved to workspace lints). |
| paddler_balancer/src/compatibility/openai_service/responses_response_builder.rs | Use json! macro import in tests. |
| paddler_balancer/src/compatibility/openai_service/openai_message.rs | Use json! macro import in tests. |
| paddler_balancer/src/compatibility/openai_service/openai_error.rs | Treat DetokenizationFailed as an error token for OpenAI mapping; test import cleanup. |
| paddler_balancer/src/compatibility/openai_service/openai_completion_request_params.rs | Use json! macro import in tests. |
| paddler_balancer/src/compatibility/openai_service/http_route/post_chat_completions.rs | Use json! macro import in tests. |
| paddler_balancer/src/compatibility/openai_service/arguments_to_tool_call_string.rs | Use json! macro import in tests. |
| paddler_balancer/src/agent_controller_pool.rs | Simplify anyhow error construction. |
| paddler_agent/src/tool_call_validator.rs | Use validator_for import; json! macro import in tests. |
| paddler_agent/src/tool_call_buffer.rs | Narrow std::mem import to take. |
| paddler_agent/src/sample_token_at_batch_index.rs | Add error context when applying samplers. |
| paddler_agent/src/prepare_conversation_history_request.rs | Update MTMD marker creation to fallible API (mtmd_default_marker()?). |
| paddler_agent/src/model_source/url.rs | Narrow tokio::fs imports in tests and update call sites. |
| paddler_agent/src/model_source/local.rs | Narrow tokio::fs import to try_exists. |
| paddler_agent/src/decoded_image.rs | Import narrowing and small refactors; use direct imports for image/resvg helpers. |
| paddler_agent/src/continuous_batch_scheduler/mod.rs | Make token classifier construction fallible; handle detokenization failures as terminal outcomes. |
| paddler_agent/src/continuous_batch_scheduler/classify_token_phase.rs | Propagate classifier ingest errors via Result. |
| paddler_agent/src/continuous_batch_scheduler/advance_generating_phase.rs | Convert classifier errors into DetokenizationFailed completion. |
| paddler_agent/src/chat_template_renderer/pyjinja_tojson.rs | Remove inline lint suppression (policy moved to workspace lints). |
| paddler_agent/src/chat_template_renderer/mod.rs | Import add_to_environment directly. |
| Cargo.toml | Add command_handler; upgrade llama-cpp-bindings*; adjust workspace clippy lint policy. |
| Cargo.lock | Lock command_handler and upgraded llama-cpp-bindings* dependency graph. |
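
The scheduler rows above describe turning classifier errors into a terminal DetokenizationFailed outcome that is treated as "done". A minimal sketch of that pattern follows, under stated assumptions: only the DetokenizationFailed name comes from this PR, while the other variants, the String payload, and the `classify_token` and `advance` helpers are hypothetical stand-ins for the real scheduler code.

```rust
/// Hypothetical stand-in for the token result type. Only DetokenizationFailed
/// is named in this PR; the other variants and the payload are assumptions.
#[derive(Debug, PartialEq, Eq)]
enum GeneratedTokenResult {
    Token(String),
    Done,
    DetokenizationFailed(String),
}

impl GeneratedTokenResult {
    /// Terminal outcomes end the stream; the failure variant counts as "done".
    fn is_done(&self) -> bool {
        matches!(self, Self::Done | Self::DetokenizationFailed(_))
    }
}

/// Hypothetical classifier step: fails on bytes that are not valid UTF-8.
fn classify_token(raw: &[u8]) -> Result<String, String> {
    String::from_utf8(raw.to_vec()).map_err(|err| err.to_string())
}

/// Converts a classifier error into a terminal result instead of panicking.
fn advance(raw: &[u8]) -> GeneratedTokenResult {
    match classify_token(raw) {
        Ok(text) => GeneratedTokenResult::Token(text),
        Err(message) => GeneratedTokenResult::DetokenizationFailed(message),
    }
}
```

Modeling the failure as a terminal value rather than a panic or a dropped stream lets consumers that already stop on "done" results release the request without special-casing the error path.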

mcharytoniuk merged commit 8d33613 into main on Jun 10, 2026
13 checks passed
mcharytoniuk deleted the command-handler-and-llama-cpp-bindings-0.9.0 branch on June 10, 2026 18:02