feat: spot-capacity diversification (cut on-demand fallback + t3 unlimited-mode burst)#90
Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
…PPINGS Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
…fault Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
…te isFatal coupling Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
…REDITS Document the new diversified spot launch (multi-candidate types × AZs before on-demand fallback + the spot_exhausted_ondemand_fallback log), the optional instance_types list in LABEL_MAPPINGS with the new six-class default, and the EC2_CPU_CREDITS env var. Fixes the stale label_mappings=[] doc drift. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
…r-naming) Our option field triggered revive var-naming (initialisms → CPU). The AWS SDK's CreditSpecificationRequest.CpuCredits field name is unchanged (not ours to rename). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
📝 Walkthrough
This PR adds ordered instance-type and subnet candidate support to the scaleup flow, updates AWS and GCP launchers to consume the new contract, configures CPU-credit handling for AWS, refreshes infra defaults, and updates documentation for the new spot selection and fallback behavior.
Changes
Multi-candidate spot launch support
Estimated code review effort: 4 (Complex) | ~60 minutes
Sequence Diagram(s)
sequenceDiagram
participant Scaleup as processRecord
participant Launcher as ec2.Launcher.Launch
participant EC2 as EC2 API
Scaleup->>Launcher: Launch(spec with InstanceTypes, SubnetIDs)
loop rotated subnets x instance types
Launcher->>EC2: RunInstances (spot, instanceType, subnetID)
EC2-->>Launcher: success or error
alt fatal error
Launcher-->>Scaleup: return error
else success
Launcher-->>Scaleup: return instance
end
end
Launcher->>EC2: RunInstances (on-demand fallback)
EC2-->>Launcher: success or error
Launcher-->>Scaleup: return instance or error
OpenSSF Scorecard — 8.3/10 ✅
Actionable comments posted: 5
♻️ Duplicate comments (1)
infra/terraform/variables.tf (1)
63-63: 🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win
Same large-label instance_type/instance_types[0] mismatch as infra/cloudformation/template.yaml.
instance_type is "c5.xlarge" while instance_types[0] is "c6i.xlarge"; align them so both IaC defaults stay consistent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@infra/terraform/variables.tf` at line 63, The default for the large instance definition is inconsistent because `instance_type` and the first entry in `instance_types` do not match. Update the large object in `variables.tf` so the `instance_type` value and `instance_types[0]` align with the same instance family, matching the corresponding large entry used in `template.yaml`.
🧹 Nitpick comments (1)
lambda/internal/gcp/gce/launcher_test.go (1)
79-126: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Add coverage for the new empty-InstanceTypes guard and subnet fallback.
The EC2 launcher's equivalent behavior (empty InstanceTypes → error, empty SubnetIDs → fallback) is directly tested (TestLaunch_NoInstanceTypesErrors, TestLaunch_EmptySubnetsNoPanic in lambda/internal/aws/ec2/launcher_test.go), but this file only updates defaultSpec()'s field types without adding equivalent tests for gce.Launcher.Launch.
✅ Suggested additional test cases
func TestLauncher_Launch_NoInstanceTypesErrors(t *testing.T) {
	fake := &fakeGCE{}
	launcher := newLauncherWithAPI(fake, defaultOpts())
	spec := defaultSpec()
	spec.InstanceTypes = nil
	if _, err := launcher.Launch(context.Background(), spec); err == nil {
		t.Fatal("want error for empty InstanceTypes, got nil")
	}
	if fake.insertCalled {
		t.Error("expected Insert not to be called")
	}
}

func TestLauncher_Launch_EmptySubnetIDsFallsBackToOptsSubnet(t *testing.T) {
	fake := &fakeGCE{insertReturnID: "jit-runner-abcdef12"}
	launcher := newLauncherWithAPI(fake, defaultOpts())
	spec := defaultSpec()
	spec.SubnetIDs = nil
	if _, err := launcher.Launch(context.Background(), spec); err != nil {
		t.Fatalf("unexpected error: %v", err)
	}
	got := fake.insertReq.GetInstanceResource().GetNetworkInterfaces()[0].GetSubnetwork()
	if got != defaultOpts().Subnet {
		t.Errorf("subnet = %q, want opts.Subnet %q", got, defaultOpts().Subnet)
	}
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lambda/internal/gcp/gce/launcher_test.go` around lines 79 - 126, Add tests in gce.Launcher.Launch for the new validation and fallback behavior: verify that an empty InstanceTypes slice returns an error and does not call fakeGCE.Insert, and verify that when SubnetIDs is empty the launcher falls back to the subnet from defaultOpts() instead of panicking or using an empty value. Use the existing TestLauncher_Launch_ReturnsInstance pattern and unique symbols like defaultSpec, newLauncherWithAPI, fakeGCE, and Launch to place the new cases nearby.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/getting-started-aws.md`:
- Around line 160-164: The `LabelMappings` example in the AWS getting-started
doc is presented as a complete override, but `--parameter-overrides` replaces
the entire JSON value and would drop the default labels. Update the
`LabelMappings` guidance to either show the full default mapping set or clearly
mark it as a partial override that must be expanded before use, keeping the
explanation attached to the `LabelMappings`/`instance_types` example.
In `@docs/troubleshooting.md`:
- Around line 196-217: The fallback diagnosis section is inaccurate: the `aws
logs filter-log-events` example returns raw messages rather than a count, and
the scaleup Lambda log fields referenced in the text are not actually emitted by
the current launcher path. Update the documentation to match the real output, or
change the logging in the scaleup flow so the `spot_exhausted_ondemand_fallback`
event includes the label class and tried candidate list alongside `runner` and
`attempted_types`, using the relevant logging code in the scaleup Lambda that
emits these messages.
In `@infra/cloudformation/template.yaml`:
- Line 73: The `large` entry’s primary `instance_type` is out of sync with the
first value in its `instance_types` list, so update the `large` label in the
CloudFormation default so both fields use the same leading instance type,
matching the pattern used by the other size labels. Also make the same alignment
fix in the duplicated default in the Terraform variables definition so both
sources stay consistent.
In `@lambda/internal/aws/ec2/launcher.go`:
- Around line 119-146: The isFatal helper in launcher.go currently treats
InvalidParameterValue and InvalidParameterCombination as fatal, which can stop
launch fallback too early; change the logic in isFatal to classify those two EC2
API errors as retryable so candidate selection can continue. Keep the existing
fatal handling for the other explicit codes, and add a test alongside
TestLaunch_UnknownErrorStaysRetryable that verifies both InvalidParameterValue
and InvalidParameterCombination do not short-circuit fallback.
- Around line 62-98: The sequential spot sweep in Launcher.Launch can consume
the full Lambda timeout before the on-demand fallback is tried. Add a ctx.Err()
check inside the nested instance-type/subnet loops, before each runInstance
attempt and after failures, so Launch exits promptly when the context is near
deadline. Keep the fix localized to Launcher.Launch and preserve the existing
isFatal handling and fallback behavior.
---
Duplicate comments:
In `@infra/terraform/variables.tf`:
- Line 63: The default for the large instance definition is inconsistent because
`instance_type` and the first entry in `instance_types` do not match. Update the
large object in `variables.tf` so the `instance_type` value and
`instance_types[0]` align with the same instance family, matching the
corresponding large entry used in `template.yaml`.
---
Nitpick comments:
In `@lambda/internal/gcp/gce/launcher_test.go`:
- Around line 79-126: Add tests in gce.Launcher.Launch for the new validation
and fallback behavior: verify that an empty InstanceTypes slice returns an error
and does not call fakeGCE.Insert, and verify that when SubnetIDs is empty the
launcher falls back to the subnet from defaultOpts() instead of panicking or
using an empty value. Use the existing TestLauncher_Launch_ReturnsInstance
pattern and unique symbols like defaultSpec, newLauncherWithAPI, fakeGCE, and
Launch to place the new cases nearby.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 83bcd4da-a623-4051-9a8c-7f3da7907f6d
📒 Files selected for processing (19)
- AGENTS.md
- CLAUDE.md
- README.md
- docs/getting-started-aws.md
- docs/getting-started-gcp.md
- docs/troubleshooting.md
- infra/cloudformation/template.yaml
- infra/terraform/lambda.tf
- infra/terraform/variables.tf
- lambda/cmd/scaleup/main.go
- lambda/cmd/scaleup/main_integration_test.go
- lambda/go.mod
- lambda/internal/aws/ec2/launcher.go
- lambda/internal/aws/ec2/launcher_test.go
- lambda/internal/compute/compute.go
- lambda/internal/config/config.go
- lambda/internal/gcp/gce/launcher.go
- lambda/internal/gcp/gce/launcher_test.go
- lambda/internal/provider/aws.go
| LabelMappings: | ||
| Type: String | ||
| Default: "[]" | ||
| Default: '[{"label":"nano","instance_type":"t2.nano"},{"label":"micro","instance_type":"t2.micro"},{"label":"small","instance_type":"t2.small"},{"label":"medium","instance_type":"t3.medium","instance_types":["t3.medium","t3a.medium","m6i.large","m5.large"]},{"label":"large","instance_type":"c5.xlarge","instance_types":["c6i.xlarge","c5.xlarge","c5a.xlarge","m6i.xlarge"]},{"label":"release","instance_type":"m5.xlarge","instance_types":["m5.xlarge","m5a.xlarge","m6i.xlarge","m6a.xlarge"]}]' |
🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win
large label's instance_type doesn't match the head of its instance_types list.
instance_type is "c5.xlarge" but instance_types[0] is "c6i.xlarge". Every other label with an instance_types array keeps instance_type aligned with the first (primary) candidate. This looks like a copy/paste slip and is confusing regardless of which field actually drives resolution at runtime.
🔧 Suggested fix
-{"label":"large","instance_type":"c5.xlarge","instance_types":["c6i.xlarge","c5.xlarge","c5a.xlarge","m6i.xlarge"]}
+{"label":"large","instance_type":"c6i.xlarge","instance_types":["c6i.xlarge","c5.xlarge","c5a.xlarge","m6i.xlarge"]}
Same mismatch is duplicated in infra/terraform/variables.tf.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@infra/cloudformation/template.yaml` at line 73, The `large` entry’s primary
`instance_type` is out of sync with the first value in its `instance_types`
list, so update the `large` label in the CloudFormation default so both fields
use the same leading instance type, matching the pattern used by the other size
labels. Also make the same alignment fix in the duplicated default in the
Terraform variables definition so both sources stay consistent.
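A small check along these lines could catch this kind of drift in either IaC default. It is only a sketch: `labelMapping` and `mismatchedLabels` are hypothetical names, the JSON field names are taken from the default shown above, and the sample input is a trimmed copy of that default rather than the full value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// labelMapping mirrors the JSON fields visible in the LabelMappings default
// (hypothetical type; the repo's own struct may differ).
type labelMapping struct {
	Label         string   `json:"label"`
	InstanceType  string   `json:"instance_type"`
	InstanceTypes []string `json:"instance_types"`
}

// mismatchedLabels returns the labels whose instance_type differs from the
// head of a non-empty instance_types list.
func mismatchedLabels(raw string) ([]string, error) {
	var mappings []labelMapping
	if err := json.Unmarshal([]byte(raw), &mappings); err != nil {
		return nil, fmt.Errorf("parse label mappings: %w", err)
	}
	var bad []string
	for _, m := range mappings {
		if len(m.InstanceTypes) > 0 && m.InstanceTypes[0] != m.InstanceType {
			bad = append(bad, m.Label)
		}
	}
	return bad, nil
}

func main() {
	// Trimmed copy of the default under review: only "large" is out of sync.
	raw := `[
	  {"label":"nano","instance_type":"t2.nano"},
	  {"label":"medium","instance_type":"t3.medium","instance_types":["t3.medium","t3a.medium"]},
	  {"label":"large","instance_type":"c5.xlarge","instance_types":["c6i.xlarge","c5.xlarge"]}
	]`
	bad, err := mismatchedLabels(raw)
	fmt.Println(bad, err)
}
```

Labels without an `instance_types` list are skipped, since the single `instance_type` form remains valid for back-compat.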
// isFatal reports whether err is a request-level error that would fail on
// on-demand too — the only case where skipping the on-demand fallback is safe.
// DEFAULT: unknown/non-API errors are retryable (return false), preserving the
// "launch success ≥ today" guarantee.
func isFatal(err error) bool {
	var apiErr smithy.APIError
	if !errors.As(err, &apiErr) {
		return false
	}
	code := apiErr.ErrorCode()
	if strings.HasPrefix(code, "InvalidAMIID") {
		return true
	}
	switch code {
	case "UnauthorizedOperation", "AuthFailure", "PendingVerification":
		return true
	// NOTE: InvalidParameterValue/Combination are treated as fatal (they fail on
	// on-demand too). This is safe only while spot-specific params are hardcoded
	// valid constants. If SpotMaxPrice (currently unwired) is ever populated and
	// misconfigured, it surfaces as InvalidParameterValue on the SPOT request only
	// — revisit excluding these codes then, so a bad MaxPrice still falls back.
	case "InvalidParameterValue", "InvalidParameterCombination":
		return true
	}
	// Allow-list is intentionally minimal and grows only as production logs
	// reveal codes that also fail on-demand. Unknown ⇒ retryable (see doc above).
	return false
}
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🌐 Web query:
EC2 RunInstances InvalidParameterValue instance type architecture mismatch AMI
💡 Result:
The error "InvalidParameterValue: The architecture... of the specified instance type does not match the architecture... of the specified AMI" occurs when you attempt to launch an EC2 instance using an Amazon Machine Image (AMI) that is not compatible with the selected instance type's processor architecture (e.g., trying to use an x86_64 AMI on an arm64 Graviton instance, or vice-versa) [1][2][3]. To resolve this issue, ensure the architectures of both the AMI and the instance type are compatible [4][2].
1. Identify the architecture of your AMI. You can check the architecture of your selected AMI using the AWS CLI [1][2]:
aws ec2 describe-images --image-ids <YOUR_AMI_ID> --query 'Images[*].Architecture'
2. Identify compatible instance types. Once you know the AMI's architecture (e.g., x86_64 or arm64), you must select an instance type that supports that same architecture [2][3]. You can list instance types for a specific architecture using the AWS CLI [3]:
aws ec2 describe-instance-types --filters Name=processor-info.supported-architecture,Values=arm64 --query "InstanceTypes[*].InstanceType" --output text
Common architectures include:
- x86_64: Supported by most standard instance families (e.g., m5, c5, t3).
- arm64: Supported by AWS Graviton-based instance families (e.g., m6g, c6g, t4g).
If you are using Infrastructure as Code (such as Terraform or AWS CloudFormation), verify that your resource configuration defines an instance type and AMI that align [2][5][3]. If you encounter this error during a validation/dry-run phase of a tool (like AWS ParallelCluster), ensure you are using the latest supported configuration or, if appropriate, verify whether the tool is incorrectly performing validations against an incompatible default AMI [6].
Citations:
- 1: https://oneuptime.com/blog/post/2026-02-12-troubleshoot-ec2-instance-launch-failures/view
- 2: https://docs.aws.amazon.com/autoscaling/ec2/userguide/ts-as-ami.html
- 3: https://stackoverflow.com/questions/67276619/how-to-find-the-list-of-applicable-instance-types-for-an-ami
- 4: The architecture 'x86_64' of the specified instance type does not match the architecture 'arm64' of the specified AMI aws/aws-parallelcluster#3795
- 5: https://stackoverflow.com/questions/75975304/terraform-aws-ec2-instance-ubuntu-error
- 6: https://github.com/aws/aws-parallelcluster/wiki/Configuration-validation-failure:-architecture-of-AMI-and-instance-type-does-not-match
🏁 Script executed:
#!/bin/bash
set -euo pipefail
git ls-files lambda/internal/aws/ec2
printf '\n--- launcher outline ---\n'
ast-grep outline lambda/internal/aws/ec2/launcher.go --view expanded || true
printf '\n--- tests outline ---\n'
ast-grep outline lambda/internal/aws/ec2/*_test.go --view expanded || true
printf '\n--- search for isFatal and launch loop ---\n'
rg -n "isFatal|InvalidParameterValue|InvalidParameterCombination|UnknownErrorStaysRetryable|divers|fallback|spot" lambda/internal/aws/ec2 -S
Repository: devopsfactory-io/jit-runners
Length of output: 7670
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,260p' lambda/internal/aws/ec2/launcher.go | cat -n
printf '\n--- tests ---\n'
for f in lambda/internal/aws/ec2/*_test.go; do
echo "### $f"
sed -n '1,260p' "$f" | cat -n
done
Repository: devopsfactory-io/jit-runners
Length of output: 23890
Treat InvalidParameterValue/InvalidParameterCombination as retryable
These EC2 errors can be candidate-specific (for example AMI/type or AZ/subnet incompatibilities), so classifying them as fatal can abort the first failed combination and skip other candidates that may still launch. Demote them to retryable and add a test covering both codes, similar to TestLaunch_UnknownErrorStaysRetryable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@lambda/internal/aws/ec2/launcher.go` around lines 119 - 146, The isFatal
helper in launcher.go currently treats InvalidParameterValue and
InvalidParameterCombination as fatal, which can stop launch fallback too early;
change the logic in isFatal to classify those two EC2 API errors as retryable so
candidate selection can continue. Keep the existing fatal handling for the other
explicit codes, and add a test alongside TestLaunch_UnknownErrorStaysRetryable
that verifies both InvalidParameterValue and InvalidParameterCombination do not
short-circuit fallback.
…g, doc + config consistency, GCP guard tests Signed-off-by: Kaio Fellipe <kaio6fellipe@gmail.com>
Addressed all 5 review findings in
🧹 Nitpick comments (1)
lambda/internal/gcp/gce/launcher_test.go (1)
128-165: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
Correct test coverage; consider table-driven style per guidelines.
Both new tests correctly assert the Launch() contract shown in launcher.go: "if len(spec.InstanceTypes) == 0 { return compute.Instance{}, fmt.Errorf("gcp/gce: launch: spec has no instance types") }" and the subnet fallback logic where subnetURL falls back to l.opts.Subnet when spec.SubnetIDs is empty.
That said, these two independent test functions could be consolidated into a single table-driven test (varying spec mutation, expected error, and expected subnetwork) per the repo's testing guideline.
As per coding guidelines, lambda/**/*_test.go: "Place tests in the same package as the code they test and write them in table-driven style."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lambda/internal/gcp/gce/launcher_test.go` around lines 128 - 165, The new Launcher tests cover two independent Launch() behaviors, but they should be consolidated into one table-driven test to match the repo’s testing guideline. Update the launcher_test.go coverage around newLauncherWithAPI, defaultSpec, and Launch so the empty InstanceTypes error case and the SubnetIDs fallback case are expressed as table rows with expected error/insert behavior and expected subnetwork values.
Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@lambda/internal/gcp/gce/launcher_test.go`:
- Around line 128-165: The new Launcher tests cover two independent Launch()
behaviors, but they should be consolidated into one table-driven test to match
the repo’s testing guideline. Update the launcher_test.go coverage around
newLauncherWithAPI, defaultSpec, and Launch so the empty InstanceTypes error
case and the SubnetIDs fallback case are expressed as table rows with expected
error/insert behavior and expected subnetwork values.
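The table shape the comment describes might look roughly like this. It is a standalone sketch: `spec`, `launch`, and `optsSubnet` are simplified stand-ins that reproduce only the two quoted behaviors of `gce.Launcher.Launch`, and the rows are driven from `main` here, whereas in the repo each row would presumably run as a `t.Run` subtest against `newLauncherWithAPI` and `fakeGCE`.

```go
package main

import (
	"errors"
	"fmt"
)

// spec is a stand-in with only the two fields the cases vary.
type spec struct {
	InstanceTypes []string
	SubnetIDs     []string
}

// optsSubnet plays the role of defaultOpts().Subnet (hypothetical value).
const optsSubnet = "subnet-opts"

// launch reproduces only the guard and fallback quoted in the review and
// returns the subnetwork it would use.
func launch(s spec) (string, error) {
	if len(s.InstanceTypes) == 0 {
		return "", errors.New("gcp/gce: launch: spec has no instance types")
	}
	if len(s.SubnetIDs) == 0 {
		return optsSubnet, nil
	}
	return s.SubnetIDs[0], nil
}

// launchCase is one table row: a spec mutation plus the expected outcome.
type launchCase struct {
	name       string
	mutate     func(*spec)
	wantErr    bool
	wantSubnet string
}

func cases() []launchCase {
	return []launchCase{
		{name: "empty instance types", mutate: func(s *spec) { s.InstanceTypes = nil }, wantErr: true},
		{name: "empty subnets fall back", mutate: func(s *spec) { s.SubnetIDs = nil }, wantSubnet: optsSubnet},
	}
}

// runCases applies every row to a fresh default spec and collects failures.
func runCases() []string {
	var failures []string
	for _, tc := range cases() {
		s := spec{InstanceTypes: []string{"e2-standard-4"}, SubnetIDs: []string{"subnet-spec"}}
		tc.mutate(&s)
		got, err := launch(s)
		if (err != nil) != tc.wantErr {
			failures = append(failures, fmt.Sprintf("%s: err = %v, wantErr %t", tc.name, err, tc.wantErr))
			continue
		}
		if !tc.wantErr && got != tc.wantSubnet {
			failures = append(failures, fmt.Sprintf("%s: subnet = %q, want %q", tc.name, got, tc.wantSubnet))
		}
	}
	return failures
}

func main() {
	fmt.Println("failures:", runCases())
}
```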
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: bc02efdb-f738-4d66-98f6-41646c7a34d2
📒 Files selected for processing (7)
- docs/getting-started-aws.md
- docs/troubleshooting.md
- infra/cloudformation/template.yaml
- infra/terraform/variables.tf
- lambda/internal/aws/ec2/launcher.go
- lambda/internal/aws/ec2/launcher_test.go
- lambda/internal/gcp/gce/launcher_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- infra/terraform/variables.tf
- docs/getting-started-aws.md
- infra/cloudformation/template.yaml
- lambda/internal/aws/ec2/launcher_test.go
- docs/troubleshooting.md
- lambda/internal/aws/ec2/launcher.go
Summary
Cuts jit-runners' recurring AWS cost (from the June FinOps pass) by launching runners across diversified spot pools before the on-demand last resort, and pinning t-family runners to standard CPU credits. Targets the ~$22/mo on-demand fallback + ~$4.55/mo t3 unlimited-mode burst.
What changed (7 commits, TDD)
- refactor(compute): widen LaunchSpec to InstanceTypes []string / SubnetIDs []string (all readers updated: AWS + GCP launchers, scaleup, tests).
- feat(scaleup): LabelMapping.InstanceTypes + resolveInstanceTypes (ordered candidate list; back-compat with single instance_type).
- feat(ec2): Launch now tries each candidate type × subnet (AZ) on spot, first-success-wins, then one on-demand last resort. isFatal (smithy.APIError) aborts only on a closed allow-list of request-level errors; unknown/non-API errors default to retryable so launch success is provably ≥ today. Deterministic rotate spreads concurrent launches across AZs. Emits event=spot_exhausted_ondemand_fallback on fallback.
- feat(ec2): t-family (t2/t3/t3a/t4g) launches get CreditSpecification=standard (gated on isBurstable), wired via EC2_CPU_CREDITS (default standard).
- feat(infra): CFN + Terraform: six-class LabelMappings default with candidate lists + EC2_CPU_CREDITS=standard.
- refactor(launcher): GCP empty-InstanceTypes guard, rotate test, isFatal coupling note.
- docs/fix(lint): docs for the new behavior/config; revive var-naming fix.
Safety (CI availability is the hard constraint)
The "launch success ≥ today" guarantee holds under two enforced conditions: the candidate list always includes the original primary type + on-demand fallback preserved; and isFatal is a closed allow-list defaulting unknown → retryable. The spec reviewer mutation-tested this: flipping the default to fatal makes the unknown/non-API tests fail, so the property is load-bearing. release candidates are all 16 GiB (no OOM). No new IAM (ec2:RunInstances suffices).
Verification
make check green (lint + vet + 232 tests). New tests cover: spot-success, capacity→next-candidate, all-fail→on-demand, fatal-abort, unknown→retryable, non-API→retryable, empty-subnets, empty-types, burstable/non-burstable credits, resolveInstanceTypes, rotate.
Design docs & context
Projects/jit-runners/specs|plans/2026-07-01-scaleup-spot-diversification*)
🤖 Generated with Claude Code
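The three helpers named in the description could be sketched as below. Only the names `resolveInstanceTypes`, `rotate`, and `isBurstable` and their described purposes come from the PR text; the signatures, the FNV hash, the rotation key, and the rule of appending a missing primary type are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// resolveInstanceTypes returns the ordered candidates for a label. With no
// candidate list it falls back to the single legacy type; otherwise it keeps
// the list order and appends the primary if absent (assumed rule).
func resolveInstanceTypes(primary string, candidates []string) []string {
	if len(candidates) == 0 {
		if primary == "" {
			return nil
		}
		return []string{primary}
	}
	out := append([]string(nil), candidates...)
	for _, c := range out {
		if c == primary {
			return out
		}
	}
	if primary != "" {
		out = append(out, primary)
	}
	return out
}

// rotate returns a copy of items shifted left by an offset derived from key,
// so the same key always yields the same order. Hashing the key with FNV-1a
// is an assumption; the PR only says the rotation is deterministic.
func rotate(items []string, key string) []string {
	if len(items) == 0 {
		return nil
	}
	h := fnv.New32a()
	h.Write([]byte(key))
	off := int(h.Sum32() % uint32(len(items)))
	return append(append([]string(nil), items[off:]...), items[:off]...)
}

// isBurstable reports whether the type is in one of the t-families the PR
// lists (t2, t3, t3a, t4g).
func isBurstable(instanceType string) bool {
	family, _, ok := strings.Cut(instanceType, ".")
	if !ok {
		return false
	}
	switch family {
	case "t2", "t3", "t3a", "t4g":
		return true
	}
	return false
}

func main() {
	types := resolveInstanceTypes("t3.medium", []string{"t3.medium", "t3a.medium", "m6i.large"})
	subnets := rotate([]string{"subnet-a", "subnet-b", "subnet-c"}, "runner-123")
	fmt.Println(types, subnets)
	for _, it := range types {
		fmt.Println(it, "burstable:", isBurstable(it))
	}
}
```

Under these assumptions a legacy mapping with only `instance_type` still resolves to a one-element list, so the launch path never sees an empty candidate set for a valid label.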
Summary by CodeRabbit
New Features
standard).
Bug Fixes
Documentation
instance_types candidate lists and clarified AWS spot fallback behavior.