ct-test-srv: cap tracked submissions #8752
When using long-running test environments, ct-test-srv grows until it OOMs, which is causing some noise in my efforts to right-size memory usage.
Pull request overview
This PR attempts to bound memory growth in ct-test-srv by capping the number of distinct hostnames tracked for CT submission counts.
Changes:
- Adds a `submissionsCap` field and `SubmissionsCap` config option.
- Moves submission-count update logic into a helper intended to enforce the cap.
- Defaults the cap to 100 when unset.
Comments suppressed due to low confidence (2)
test/ct-test-srv/main.go:175
- The helper indexes the map with `hostnames`, but that identifier is only local to `addChainOrPre` and is not in scope here. This causes a compile error and also ignores the `hostname` argument.
_, ok := is.submissions[hostnames]
if ok || len(is.submissions) < is.submissionsCap {
is.submissions[hostnames]++
test/ct-test-srv/main.go:169
- After removing the inline increment from `addChainOrPre`, this new helper is never called anywhere in the package. Even if the compile errors are fixed, successful CT submissions will no longer update `is.submissions`, so `/submissions` will always return 0 for new hosts.
func (is *integrationSrv) addSubmission(hostname String) {
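The two suppressed comments describe a helper that should key on its own `hostname` argument and admit new keys only while the map is below the cap. A minimal sketch of that behavior follows. It assumes a hypothetical `submissionTracker` type standing in for the relevant fields of `integrationSrv`, and it assumes the helper takes its own lock. Neither is confirmed by the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// submissionTracker is a hypothetical stand-in for the parts of
// ct-test-srv's integrationSrv that track submission counts.
type submissionTracker struct {
	sync.Mutex
	submissions    map[string]int64
	submissionsCap int
}

// addSubmission increments the count for hostname. A hostname that is
// already tracked is always counted; a new hostname is admitted only
// while the map holds fewer than submissionsCap entries. Nothing is
// evicted or reset once the cap is reached.
func (st *submissionTracker) addSubmission(hostname string) {
	st.Lock()
	defer st.Unlock()
	_, ok := st.submissions[hostname]
	if ok || len(st.submissions) < st.submissionsCap {
		st.submissions[hostname]++
	}
}

func main() {
	st := &submissionTracker{
		submissions:    make(map[string]int64),
		submissionsCap: 2,
	}
	for _, h := range []string{"a.example.com", "b.example.com", "c.example.com", "a.example.com"} {
		st.addSubmission(h)
	}
	// With a cap of 2, "c.example.com" is never tracked.
	fmt.Println(len(st.submissions), st.submissions["a.example.com"], st.submissions["c.example.com"])
}
```

For the helper to have any effect, the submission handler still has to call it after each successful submission, which is the second point raised above.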
cap := p.SubmissionsCap
if cap == 0 {
	cap = 100
_, ok := is.submissions[hostnames]
if ok || len(is.submissions) < is.submissionsCap {
	is.submissions[hostnames]++
}
that's the idea. I don't think evicting or reset is useful for integration tests
jsha left a comment
Looks good in general. For Boulder CI let's configure the cap to something ludicrous like 1M. If we add test cases in the future that do lots more issuance, I don't want to wind up with a mysterious flake because sometimes we hit the cap and sometimes don't.
Let's just set the default to 1M then. I asked Claude to do an experiment to measure usage for 1M. It says:
That seems plausible enough to me. We can always lower it (even to zero) if there are any long-running test environments we care about saving memory in.
For reference, the test:
package main
import (
	"fmt"
	"runtime"
	"strings"
	"testing"
)

// TestSubmissionsMapMemory measures the memory footprint of a map[string]int64
// populated with 1 million entries shaped like the keys used by ct-test-srv's
// submissions map (joined DNS names from a certificate). It prints the
// retained heap delta so we know roughly what a 1M submissionsCap would cost.
func TestSubmissionsMapMemory(t *testing.T) {
	const n = 1_000_000

	// Try a few hostname shapes to bracket the realistic range. Keys in the
	// real server are strings.Join(cert.DNSNames, ",") so a single short name
	// is the low end, while a SAN-heavy cert lands around the high end.
	cases := []struct {
		name    string
		makeKey func(i int) string
	}{
		{
			name:    "short single hostname (~20 bytes)",
			makeKey: func(i int) string { return fmt.Sprintf("h%d.example.com", i) },
		},
		{
			name: "two-SAN typical (~45 bytes)",
			makeKey: func(i int) string {
				return fmt.Sprintf("h%d.example.com,www.h%d.example.com", i, i)
			},
		},
		{
			name: "SAN-heavy (~250 bytes, 10 names)",
			makeKey: func(i int) string {
				parts := make([]string, 10)
				for j := range parts {
					parts[j] = fmt.Sprintf("h%d-%d.example.com", i, j)
				}
				return strings.Join(parts, ",")
			},
		},
	}

	for _, tc := range cases {
		t.Run(tc.name, func(t *testing.T) {
			runtime.GC()
			var before runtime.MemStats
			runtime.ReadMemStats(&before)

			m := make(map[string]int64, n)
			var keyBytes uint64
			for i := 0; i < n; i++ {
				k := tc.makeKey(i)
				keyBytes += uint64(len(k))
				m[k]++
			}

			runtime.GC()
			var after runtime.MemStats
			runtime.ReadMemStats(&after)

			// Keep m reachable past ReadMemStats so it can't be GC'd early.
			if len(m) != n {
				t.Fatalf("unexpected map size %d", len(m))
			}

			heapDelta := after.HeapAlloc - before.HeapAlloc
			perEntry := float64(heapDelta) / float64(n)
			avgKey := float64(keyBytes) / float64(n)
			t.Logf("entries=%d avg_key_len=%.1fB heap_delta=%.1f MiB (%.1f bytes/entry, key bytes alone = %.1fB/entry)",
				n, avgKey, float64(heapDelta)/(1024*1024), perEntry, avgKey)
		})
	}
}
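As a rough cross-check on whatever the test above reports, a floor on the retained size can be computed from type sizes alone. This is only a sketch: `lowerBoundBytes` is a hypothetical helper, and it assumes a 64-bit platform, where a Go string header is 16 bytes and an `int64` is 8. It ignores map bucket metadata, load-factor slack, and allocator size-class rounding, so measured numbers should come out higher.

```go
package main

import "fmt"

// lowerBoundBytes returns a floor on the bytes retained by a
// map[string]int64 holding n entries with the given average key length.
// Assumption: 64-bit platform (16-byte string header, 8-byte int64).
// Map bucket overhead and allocator rounding are deliberately ignored.
func lowerBoundBytes(n, avgKeyLen int) int {
	const (
		stringHeader = 16 // pointer + length
		valueSize    = 8  // int64
	)
	return n * (stringHeader + valueSize + avgKeyLen)
}

func main() {
	// Key lengths mirror the approximate sizes named in the test cases.
	for _, keyLen := range []int{20, 45, 250} {
		b := lowerBoundBytes(1_000_000, keyLen)
		fmt.Printf("avg key %3dB: at least %.1f MiB\n", keyLen, float64(b)/(1024*1024))
	}
}
```

If the measured heap delta for a case falls below this floor, the measurement itself is suspect, for example because the map was collected before the second `ReadMemStats` call.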