From 1eb2dfeed6b4d16904c99a160171da83c14199a0 Mon Sep 17 00:00:00 2001
From: sd
Date: Wed, 13 May 2026 11:41:40 +0530
Subject: [PATCH 1/2] chore: updated bedrock docs with service tier support

---
 integrations/llms/bedrock/aws-bedrock.mdx | 66 ++++++++++++++++++++++-
 1 file changed, 65 insertions(+), 1 deletion(-)

diff --git a/integrations/llms/bedrock/aws-bedrock.mdx b/integrations/llms/bedrock/aws-bedrock.mdx
index f64ca3cc..acc37be9 100644
--- a/integrations/llms/bedrock/aws-bedrock.mdx
+++ b/integrations/llms/bedrock/aws-bedrock.mdx
@@ -145,12 +145,14 @@ Use the Portkey instance to send requests to Anthropic. You can also override th
 
 ## Using the /messages Route with Bedrock Models
 
-Access Bedrock's Claude models through Anthropic's native`/messages` endpoint using Portkey's SDK or Anthropic's SDK.
+Access Bedrock's Claude models through Anthropic's native `/messages` endpoint using Portkey's SDK or Anthropic's SDK.
 
 This route only works with Claude models on Bedrock. For other models, use the standard OpenAI compliant endpoint.
 
+Pass `service_tier` with Bedrock Claude models to use Bedrock service tiers. See [Bedrock Service Tiers](#bedrock-service-tiers).
+
 ```sh
@@ -828,6 +830,68 @@ If you require the model to [respond with certain fields](https://docs.aws.amazo
   "additionalModelResponseFieldPaths": [ "/stop_sequence" ]
 ```
 
+## Bedrock Service Tiers
+
+[Amazon Bedrock service tiers](https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html) let you choose how Bedrock handles inference capacity for a request. Availability depends on your model, AWS Region, account access, and AWS quotas.
+
+Pass `service_tier` in the request body. Portkey forwards the required Bedrock service tier format for both Chat Completions and Claude `/messages` requests.
+
+### Supported Values
+
+| Value | Notes |
+| ----- | ----- |
+| `default` | Uses the default Bedrock service tier. |
+| `priority` | Uses priority capacity when it is available for your model and account. |
+| `flex` | Uses the Bedrock flex tier when the model supports it. |
+| `reserved` | Uses reserved capacity when available. |
+| `auto`, `scale`, `standard_only` | Compatibility values from OpenAI or Anthropic clients. Portkey maps these to Bedrock's `default` tier. |
+
+AWS determines which tiers are available for each model, Region, and account. If a tier is not enabled for your AWS environment, Bedrock may reject the request.
+
+### Response Fields
+
+| Route | Field | Notes |
+| ----- | ----- | ----- |
+| Chat Completions | `service_tier` | Returns Bedrock's tier value when Bedrock includes it in the response, such as `default`, `flex`, `priority`, or `reserved`. |
+| `/messages` | `usage.service_tier` | Returns `standard` or `priority`, following Anthropic's response shape. Portkey maps Bedrock `default` and `flex` to `standard`, and `priority` and `reserved` to `priority`. |
+
+The examples below use `priority`. Replace it with any tier supported for your model and AWS account.
+
+```js NodeJS SDK
+const chatCompletion = await portkey.chat.completions.create({
+    messages: [{ role: "user", content: "Say this is a test" }],
+    model: "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    max_tokens: 250,
+    service_tier: "priority"
+})
+```
+
+```python Python SDK
+completion = portkey.chat.completions.create(
+    messages=[{ "role": "user", "content": "Say this is a test" }],
+    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    max_tokens=250,
+    service_tier="priority"
+)
+```
+
+```sh cURL
+curl https://api.portkey.ai/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "x-portkey-api-key: $PORTKEY_API_KEY" \
+  -H "x-portkey-provider: @your-bedrock-provider" \
+  -d '{
+    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    "max_tokens": 250,
+    "messages": [{"role": "user", "content": "Say this is a test"}],
+    "service_tier": "priority"
+  }'
+```
+
 ## Managing AWS Bedrock Prompts
 
 You can manage all prompts to AWS bedrock in the [Prompt Library](/product/prompt-library). All the current models of Anthropic are supported and you can easily start testing different prompts.

From 6ca4f55b9a00aa97c3f66b760c2d098a5d529f48 Mon Sep 17 00:00:00 2001
From: sd
Date: Wed, 13 May 2026 11:47:05 +0530
Subject: [PATCH 2/2] chore: fixed the model name for example

---
 integrations/llms/bedrock/aws-bedrock.mdx | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/integrations/llms/bedrock/aws-bedrock.mdx b/integrations/llms/bedrock/aws-bedrock.mdx
index acc37be9..1145b20d 100644
--- a/integrations/llms/bedrock/aws-bedrock.mdx
+++ b/integrations/llms/bedrock/aws-bedrock.mdx
@@ -863,7 +863,7 @@ The examples below use `priority`. Replace it with any tier supported for your m
 ```js NodeJS SDK
 const chatCompletion = await portkey.chat.completions.create({
     messages: [{ role: "user", content: "Say this is a test" }],
-    model: "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    model: "moonshot.kimi-k2-thinking",
     max_tokens: 250,
     service_tier: "priority"
 })
@@ -872,7 +872,7 @@ const chatCompletion = await portkey.chat.completions.create({
 ```python Python SDK
 completion = portkey.chat.completions.create(
     messages=[{ "role": "user", "content": "Say this is a test" }],
-    model="us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    model="moonshot.kimi-k2-thinking",
     max_tokens=250,
     service_tier="priority"
 )
@@ -884,7 +884,7 @@ curl https://api.portkey.ai/v1/chat/completions \
   -H "x-portkey-api-key: $PORTKEY_API_KEY" \
   -H "x-portkey-provider: @your-bedrock-provider" \
   -d '{
-    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
+    "model": "moonshot.kimi-k2-thinking",
     "max_tokens": 250,
     "messages": [{"role": "user", "content": "Say this is a test"}],
     "service_tier": "priority"