Skip to content

Remove manual CertReloader wiring#6946

Merged
ycombinator merged 2 commits into
elastic:mainfrom
ycombinator:remove-manual-cert-reload-wiring
May 20, 2026
Merged

Remove manual CertReloader wiring#6946
ycombinator merged 2 commits into
elastic:mainfrom
ycombinator:remove-manual-cert-reload-wiring

Conversation

@ycombinator
Copy link
Copy Markdown
Contributor

@ycombinator ycombinator commented Apr 30, 2026

What is the problem this PR solves?

PR #6838 added TLS certificate hot-reload support by manually wiring a CertReloader in server.go. That manual wiring now causes a startup failure when TLS certificates are passed as inline PEMs (rather than file paths).

The root cause: elastic-agent-libs v0.43.0 (shipped via #7054) added an IsPEMString guard to NewCertReloader that rejects inline PEM strings — correctly, since there is nothing to reload from disk. But fleet-server's server.go calls NewCertReloader unconditionally whenever CertificateReload.IsEnabled() (which is true by default), without checking whether the cert is a file path or an inline PEM. The result:

Unit state changed fleet-server-default-fleet-server (STARTING->FAILED):
Error - failed to initialize TLS cert reloader: certificate must be a file path, not an inline PEM

This broke the Kibana CI OSQuery Cypress tests (kibana-package-registry-verify-and-promote/builds/533), where fleet-server is configured with inline PEM certificates.

How does this PR solve the problem?

Removes the manual CertReloader setup block from server.Run() and the local resolvePassphrase helper. As of elastic-agent-libs v0.43.1 (#7054), LoadTLSServerConfig already handles CertReloader wiring automatically — correctly skipping it when inline PEMs are used and applying it only for file-path certificates. BuildServerConfig() then sets GetCertificate on the resulting tls.Config.

No behavioral change for file-path cert configurations: hot-reload continues to work exactly as before, as validated by Test_server_TLSCertReload.

How to test this PR locally

The existing integration test validates end-to-end cert reload:

go test ./internal/pkg/api/ -run Test_server_TLSCertReload -v

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have added tests that prove my fix is effective or that my feature works

Related issues

🤖 Generated with Claude Code

@ycombinator ycombinator requested a review from a team as a code owner April 30, 2026 23:03
@ycombinator ycombinator added Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team skip-changelog labels Apr 30, 2026
@ycombinator ycombinator changed the title Remove manual CertReloader wiring from server Remove manual CertReloader wiring; adopt library-level TLS cert reload Apr 30, 2026
@ycombinator ycombinator changed the title Remove manual CertReloader wiring; adopt library-level TLS cert reload Remove manual CertReloader wiring Apr 30, 2026
@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 30, 2026

This pull request does not have a backport label. Could you fix it @ycombinator? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@ycombinator ycombinator marked this pull request as draft May 7, 2026 15:06
ycombinator and others added 2 commits May 20, 2026 12:11
LoadTLSServerConfig in elastic-agent-libs now handles CertReloader
setup internally (elastic/elastic-agent-libs#417), making the manual
wiring in server.go redundant. This also removes the local
resolvePassphrase helper which is now handled by the library.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ycombinator ycombinator force-pushed the remove-manual-cert-reload-wiring branch from bc2bc53 to b8be8e8 Compare May 20, 2026 19:18
@ycombinator ycombinator marked this pull request as ready for review May 20, 2026 19:18
@ycombinator ycombinator enabled auto-merge (squash) May 20, 2026 19:55
@ycombinator ycombinator merged commit 4490fee into elastic:main May 20, 2026
10 checks passed
@ycombinator ycombinator deleted the remove-manual-cert-reload-wiring branch May 20, 2026 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip-changelog Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants