The following example demonstrates how you can log LLM request-related information in the gateway's access log to improve analytics and auditing. The following variables are available:

* `request_llm_model`: LLM model name specified in the request.
* `request_type`: Type of request, where the value could be `traditional_http`, `ai_chat`, or `ai_stream`.
* `llm_time_to_first_token`: Duration from request sending to the first token received from the LLM service, in milliseconds.
* `llm_model`: LLM model name reported by the LLM service in its response.
* `llm_prompt_tokens`: Number of tokens in the prompt.
* `llm_completion_tokens`: Number of tokens in the chat completion returned by the LLM service.

In addition, the following standard NGINX upstream variables are automatically populated when `ai-proxy` sends requests via cosocket transport:

* `upstream_addr`: Address of the upstream LLM service (e.g., `api.openai.com:443`).
* `upstream_status`: HTTP status code returned by the upstream LLM service.
* `upstream_response_time`: Total time spent receiving the response from the upstream LLM service, in seconds (e.g., `2.858`).
* `upstream_connect_time`: Time spent establishing the connection to the upstream LLM service, in seconds.
* `upstream_header_time`: Time spent receiving the response headers from the upstream LLM service, in seconds.
* `upstream_host`: Hostname of the upstream LLM service as configured in the endpoint (e.g., `api.openai.com`).
* `upstream_scheme`: Scheme used to connect to the upstream LLM service (e.g., `https`).
* `upstream_uri`: Request URI path sent to the upstream LLM service (e.g., `/v1/chat/completions`).

Update the access log format in your configuration file to include the additional LLM-related variables:
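
As a sketch, assuming the standard `nginx_config.http.access_log_format` option in `config.yaml` (the field order and the non-LLM fields here are illustrative, not prescribed by the plugin):

```yaml
nginx_config:
  http:
    enable_access_log: true
    access_log: logs/access.log
    # Illustrative format: standard request fields followed by the upstream
    # and LLM variables described above.
    access_log_format: '$remote_addr [$time_local] "$request" $status
      $upstream_addr $upstream_status $upstream_response_time
      $request_type $request_llm_model $llm_model
      $llm_time_to_first_token $llm_prompt_tokens $llm_completion_tokens'
```

Reload APISIX (for example, with `apisix reload`) so the regenerated NGINX configuration picks up the new log format.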
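With a format along the lines of the sketch above, a chat completion request through the gateway would produce an access log entry similar to the following. The entry is a single line in practice, and the values are purely illustrative, mirroring the ones discussed below:

```text
192.168.1.10 [10/Oct/2024:12:00:00 +0000] "POST /anything HTTP/1.1" 200 api.openai.com:443 200 2.858 ai_chat gpt-4 gpt-4 2858 23 8
```
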
The access log entry shows that the upstream address is `api.openai.com:443` with status `200`, the request type is `ai_chat`, the upstream response time is `2.858` seconds, the time to first token is `2858` milliseconds, the requested LLM model is `gpt-4`, the responding LLM model is `gpt-4`, prompt token usage is `23`, and completion token usage is `8`.