fix: healthchecks for ecs, snowflake and trino #112

Merged
Yash Shrivastava (alephys26) merged 3 commits into main from alephys26/fix-healthcheck on Jun 17, 2026

@alephys26

Contributor

Description

This pull request updates the health check logic for ECS, Snowflake, and Trino clusters. It removes the cluster capacity checks for ECS and Trino, and extends the Snowflake health check so that suspended warehouses with auto-resume enabled are treated as ready.

Changes

Cluster Health Check Logic Updates:

  • Removed the ECS cluster capacity check by deleting the logic that blocked clusters at or above their maximum task count. (internal/pkg/object/command/ecs/ecs.go)
  • Removed the Trino cluster capacity check, including the entire checkTrinoClusterCapacity method and its invocation, so Trino health checks now only verify the /v1/info endpoint. (internal/pkg/object/command/trino/trino.go)

Snowflake Health Check Improvements:

  • Enhanced the Snowflake health check to:
    • Detect the auto_resume column index along with state.
    • Allow warehouses in the SUSPENDED state to pass the health check if auto_resume is enabled, otherwise fail with a clear error message.
    • Provide more descriptive error messages for non-ready states. (internal/pkg/object/command/snowflake/snowflake.go)
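
The Snowflake decision described above can be sketched roughly as follows. This is an illustrative Go example, not the PR's implementation: `findColumns` and `evaluateWarehouse` are hypothetical names, rows are assumed to be already converted to strings, and treating `STARTED` as the ready state and `"true"` as the enabled value of `auto_resume` are assumptions about the warehouse listing.

```go
package main

import (
	"fmt"
	"strings"
)

// findColumns locates the "state" and "auto_resume" columns by name
// (case-insensitive) and returns -1 for any column that is missing.
func findColumns(columns []string) (stateIdx, autoResumeIdx int) {
	stateIdx, autoResumeIdx = -1, -1
	for i, c := range columns {
		switch strings.ToLower(c) {
		case "state":
			stateIdx = i
		case "auto_resume":
			autoResumeIdx = i
		}
	}
	return stateIdx, autoResumeIdx
}

// evaluateWarehouse (hypothetical helper) returns nil when the warehouse
// is considered ready. Assumption: "STARTED" is the ready state, and
// auto_resume is reported as the string "true" when enabled.
func evaluateWarehouse(columns, row []string) error {
	stateIdx, autoIdx := findColumns(columns)
	if stateIdx < 0 || stateIdx >= len(row) {
		return fmt.Errorf("state column not found in warehouse listing")
	}
	state := strings.ToUpper(strings.TrimSpace(row[stateIdx]))

	switch state {
	case "STARTED":
		return nil
	case "SUSPENDED":
		// A suspended warehouse passes only if it will resume on demand.
		if autoIdx >= 0 && autoIdx < len(row) &&
			strings.EqualFold(strings.TrimSpace(row[autoIdx]), "true") {
			return nil
		}
		return fmt.Errorf("warehouse is SUSPENDED and auto_resume is not enabled")
	default:
		return fmt.Errorf("warehouse is not ready: state=%s", state)
	}
}

func main() {
	cols := []string{"name", "state", "auto_resume"}
	fmt.Println(evaluateWarehouse(cols, []string{"WH1", "SUSPENDED", "true"}) == nil)
	fmt.Println(evaluateWarehouse(cols, []string{"WH2", "SUSPENDED", "false"}))
}
```

Looking columns up by name rather than by fixed position keeps the check working if the listing gains or reorders columns.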

Copilot AI review requested due to automatic review settings June 17, 2026 11:45

Copilot AI left a comment

Contributor

⚠️ Not ready to approve

The Snowflake health check reads scanned column values using brittle string assertions that can mis-detect state/auto_resume and incorrectly fail (or misreport) readiness.
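
The review comments themselves are not reproduced on this page, so the following is only one possible way to make the handling of scanned values less brittle, under the assumption that the driver may return a column as `string`, `[]byte`, `bool`, or `nil`. The helper names `asString` and `asBool` are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// asString converts a scanned column value to a string without
// panicking or silently yielding "" when the value is not a string.
func asString(v any) string {
	switch t := v.(type) {
	case nil:
		return ""
	case string:
		return t
	case []byte:
		return string(t)
	default:
		// Covers bool, numeric types, and anything else the driver returns.
		return fmt.Sprint(t)
	}
}

// asBool interprets a scanned value as a boolean. Native bools are
// used directly; other values are parsed with strconv.ParseBool, and
// anything unparseable is treated as false.
func asBool(v any) bool {
	if b, ok := v.(bool); ok {
		return b
	}
	b, err := strconv.ParseBool(strings.TrimSpace(asString(v)))
	return err == nil && b
}

func main() {
	fmt.Println(asString([]byte("SUSPENDED")), asBool("true"), asBool(nil))
}
```

A bare `v.(string)` assertion would either panic or report an empty state when the driver hands back `[]byte`, which is the kind of mis-detection the review warns about.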

Pull request overview

Updates cluster health checks to simplify readiness logic for Trino and ECS, and to improve Snowflake readiness detection for suspended warehouses.

Changes:

  • Trino: Health check now only validates the /v1/info endpoint (removes /v1/cluster capacity/worker checks).
  • ECS: Removes the max-task-count capacity gating from the ECS cluster health check.
  • Snowflake: Adds auto_resume awareness so SUSPENDED warehouses can be considered healthy when auto-resume is enabled.

File summaries

| File | Description |
| --- | --- |
| internal/pkg/object/command/trino/trino.go | Removes Trino cluster-capacity/worker validation and limits the health check to /v1/info only. |
| internal/pkg/object/command/snowflake/snowflake.go | Extends Snowflake health check to handle SUSPENDED warehouses via auto_resume. |
| internal/pkg/object/command/ecs/ecs.go | Drops ECS cluster “at capacity” blocking logic from health checks. |

Copilot's findings

  • Files reviewed: 3/3 changed files
  • Comments generated: 2

Comment thread internal/pkg/object/command/snowflake/snowflake.go Outdated
Comment thread internal/pkg/object/command/snowflake/snowflake.go Outdated
@alephys26 Yash Shrivastava (alephys26) merged commit 9e77476 into main Jun 17, 2026
7 checks passed
@alephys26 Yash Shrivastava (alephys26) deleted the alephys26/fix-healthcheck branch June 17, 2026 12:41

3 participants