blog + skill: Spring AI advisor post + Phase 9 codex-loop rewrite #642

Merged
amavashev merged 7 commits into main from blog/cycles-spring-ai-starter-advisors-walkthrough on May 14, 2026

@amavashev amavashev commented May 14, 2026

Summary

Two related changes, bundled because the skill update is the process improvement that came directly out of this post's review experience.

1. New blog post (blog/cycles-spring-ai-starter-advisors-walkthrough.md): covers how cycles-spring-ai-starter 0.3.1 inserts the reserve-commit-release lifecycle into Spring AI's advisor chain. Complements (not duplicates) how-scalerx-wired-cycles-into-a-java-agent-runtime.md: that post is @Cycles on raw OpenAI in plain Spring Boot; this one is Spring AI-native (advisor chain, ChatClient.Builder, Flux streaming, SubjectResolver, jtokkit, tool gating, cycles.reservation_id on Micrometer traces). Timed to the v0.1.0 → v0.3.1 launches in the last 48h.

2. /blog skill — Phase 9 rewrite (.claude/skills/blog/SKILL.md + .agents/skills/blog/SKILL.md). The old Phase 9 said "apply external reviewer feedback precisely" — which contradicted the project's blog-process rule (feedback is input, not directive — evaluate on merit). New Phase 9:

  • Leads with the explicit evaluate / apply / modify / skip rule, with reasoning preserved in both the conversation reply and the commit message.
  • Phase 9a: codex-cli (0.130.0) as automated external reviewer via codex exec --sandbox read-only + codex exec resume --last. Documents the 0.130.0 gotcha that resume does NOT accept --sandbox or --cd — those inherit from the original session. Loop until SHIP or stylistic-only; cap at 4 rounds.
  • Phase 9b: human external reviewer fallback with the same evaluate-on-merit rule.

Coverage (blog post)

Sections:

  1. Why an advisor, not an annotation
  2. CyclesBudgetCallAdvisor: reserve / commit (on real ChatResponse.Usage) / release
  3. CyclesBudgetStreamAdvisor: Flux.defer per-subscription, concatWith(Mono.defer) fail-closed commit
  4. SubjectResolver for per-tenant attribution (v0.3.0)
  5. PromptTokenEstimator via jtokkit + canonical USD_MICROCENTS math (preempts the v0.3.0 docs 10x bug)
  6. Tool gating via CyclesToolGate.wrap (opt-in by design)
  7. cycles.reservation_id on Micrometer observations (v0.3.0)
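
The reserve / commit / release lifecycle named in section 2 can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the starter's code: `BudgetClient`, `guardedCall`, `RecordingClient`, and the `long` amounts are hypothetical names introduced here. It shows only the shape described in this PR: one reserve, one delegation to the model, then exactly one of commit (on reported usage) or release (on failure).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class LifecycleSketch {
    /** Hypothetical budget client; NOT the cycles-spring-ai-starter API. */
    interface BudgetClient {
        String reserve(long estimatedAmount);           // wire call 1
        void commit(String reservationId, long actual); // wire call 2, success path
        void release(String reservationId);             // wire call 2, failure path
    }

    /** Reserve, delegate once, then commit on success or release on failure. */
    static long guardedCall(BudgetClient budget, long estimate, Supplier<Long> modelCall) {
        String id = budget.reserve(estimate);
        long actual;
        try {
            actual = modelCall.get();   // the single delegation to the model
        } catch (RuntimeException e) {
            budget.release(id);         // the call failed: hand the reservation back
            throw e;
        }
        budget.commit(id, actual);      // settle against reported usage, not the estimate
        return actual;
    }

    /** In-memory client that records calls, for demonstration only. */
    static final class RecordingClient implements BudgetClient {
        final List<String> calls = new ArrayList<>();
        public String reserve(long estimatedAmount) { calls.add("reserve:" + estimatedAmount); return "r-1"; }
        public void commit(String id, long actual) { calls.add("commit:" + id + ":" + actual); }
        public void release(String id) { calls.add("release:" + id); }
    }

    public static void main(String[] args) {
        RecordingClient ok = new RecordingClient();
        guardedCall(ok, 500, () -> 420L);
        System.out.println(ok.calls);

        RecordingClient failed = new RecordingClient();
        try {
            guardedCall(failed, 500, () -> { throw new IllegalStateException("model down"); });
        } catch (IllegalStateException expected) {
            System.out.println(failed.calls);
        }
    }
}
```

Either path makes two wire calls around one delegation. The sketch deliberately leaves a failing `commit` unhandled; how the real advisors treat that case is a separate question from this outline.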

Review cycles completed (blog post)

  • Internal Cycles 1–3: link audit, fact-check (18/18 claims verified against the runcycles/cycles-spring-ai-starter README + v0.3.1/v0.3.0/v0.2.0 release notes), SEO, full re-read, scorecard at 9.5/10. Commit 240f07d (initial draft after internal cycles).
  • Codex external reviewer — 3 rounds with apply/modify/skip evaluation each round.
    • Round 1 (commit 688a74f): 16 findings + 2 open questions → 13 applied / 2 modified / 1 skipped / 2 deferred. See #issuecomment-4450047730 for the full tally.
    • Round 2 (commit 9677b4d): 3 findings → all applied. Codex accepted all round-1 deferred/skipped items without re-litigation.
    • Round 3: SHIP — no new findings.

Test plan

  • Render preview locally (npm run dev) — verify post lands at /blog/cycles-spring-ai-starter-advisors-walkthrough
  • Confirm <title> ≤ 60 chars (50-char frontmatter + — Cycles)
  • Confirm description meta is 154 chars
  • Verify the rewritten /blog skill Phase 9 in .claude/skills/blog/SKILL.md is discovered (the in-repo copy is what the running session uses)
  • Optional second human-reviewer pass
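
The title and description checks above are simple length arithmetic, sketched below. This is an illustration only: `TitleBudgetSketch` and its methods are hypothetical, and the suffix is assumed to be a space, an em dash, a space, and the site name, as the test-plan item suggests. Under that assumption a 50-character frontmatter title renders at 59 characters.

```java
final class TitleBudgetSketch {
    // Assumed rendered suffix: space, em dash (U+2014), space, "Cycles".
    static final String SITE_SUFFIX = " \u2014 Cycles";
    static final int MAX_TITLE = 60;             // limit stated in the test plan
    static final int EXPECTED_DESCRIPTION = 154; // length stated in the test plan

    /** Length of the rendered title, counted in Unicode code points. */
    static int renderedTitleLength(String frontmatterTitle) {
        String rendered = frontmatterTitle + SITE_SUFFIX;
        return rendered.codePointCount(0, rendered.length());
    }

    static boolean titleFits(String frontmatterTitle) {
        return renderedTitleLength(frontmatterTitle) <= MAX_TITLE;
    }

    static boolean descriptionMatches(String description) {
        return description.codePointCount(0, description.length()) == EXPECTED_DESCRIPTION;
    }

    public static void main(String[] args) {
        String fiftyChars = "x".repeat(50); // stand-in for the real 50-char title
        System.out.println(renderedTitleLength(fiftyChars)); // 50 + 9 = 59
        System.out.println(titleFits(fiftyChars));
    }
}
```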

amavashev added 3 commits May 14, 2026 06:33
Walks through how cycles-spring-ai-starter 0.3.1 inserts the
reserve-commit-release lifecycle into Spring AI's advisor chain:
ChatClientCustomizer wiring at HIGHEST_PRECEDENCE+100, Flux.defer
streaming with fail-closed commit in concatWith, SubjectResolver
for per-tenant attribution (v0.3.0), jtokkit-backed estimator with
canonical USD_MICROCENTS math, opt-in CyclesToolGate.wrap for
tool callbacks, and cycles.reservation_id on Micrometer
observations for trace↔reservation correlation.
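
As a rough illustration of the USD_MICROCENTS arithmetic mentioned above, the sketch below converts a token count and a per-million-token price into microcents. Everything here is an assumption made for the example: the reading of the unit as 1 USD = 100 cents = 100,000,000 microcents, the `$2.50` price, the round-up policy, and the names `MicrocentsSketch` and `toMicrocents`. It is not the starter's estimator.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class MicrocentsSketch {
    // Assumption: 1 USD = 100 cents, and 1 cent = 1,000,000 microcents.
    static final BigDecimal MICROCENTS_PER_USD = BigDecimal.valueOf(100_000_000L);
    static final BigDecimal TOKENS_PER_MILLION = BigDecimal.valueOf(1_000_000L);

    /**
     * Converts tokens at a USD-per-million-token price into whole microcents.
     * Rounds up so a reservation never undershoots (a choice made for this sketch).
     */
    static long toMicrocents(long tokens, BigDecimal usdPerMillionTokens) {
        return BigDecimal.valueOf(tokens)
                .multiply(usdPerMillionTokens)
                .multiply(MICROCENTS_PER_USD)
                .divide(TOKENS_PER_MILLION, 0, RoundingMode.CEILING)
                .longValueExact();
    }

    public static void main(String[] args) {
        // Hypothetical price: $2.50 per 1M tokens, so 1,000 tokens cost $0.0025.
        System.out.println(toMicrocents(1_000, new BigDecimal("2.50"))); // 250000
    }
}
```

Keeping both conversion factors as named constants in one method is one way to make an order-of-magnitude slip easy to spot in review.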

Complements (not duplicates)
how-scalerx-wired-cycles-into-a-java-agent-runtime.md: that post
covers @Cycles on raw OpenAI in plain Spring Boot; this one is
Spring AI-native at the advisor-chain layer. All claims
fact-checked against the runcycles/cycles-spring-ai-starter README
and v0.3.1/v0.3.0/v0.2.0 release notes.

Applied per blog-process rule (evaluate on merit, not auto-apply).
13 applied / 2 modified / 1 scope-skip / 2 deferred to round 2.

Factual:
- Lifecycle: three wire calls (reserve + commit-or-release) + one
  delegation, not four wire calls
- Drop "sees every retry" claim; clarify spring.ai.retry wraps the
  ChatModel below the advisor, so advisor sees one logical call
- Clarify chatClient.prompt(...).stream() returns a stream-spec;
  .chatResponse() yields Flux<ChatResponse>
- Soften "4x low" for CJK to "several times low"

Code:
- Mark streaming pseudocode as illustrative; note element-type
  adaptation on concatWith elided
- Note that release-on-commit-failure trades cost-side accuracy
  for clean reservation-state invariant
- MethodToolCallback example: explicit "illustrative — needs
  reflected Method, toolObject(...)" comment

Clarity:
- Fix inverted "trade explicitness for silent under-billing" — now
  "choose explicit startup signal over silent under-billing"
- End-to-end: qualify "reservation IDs appear" to require explicit
  convention attachment + emit-reservation-id-on-trace enabled
- End-to-end: scope "no call-site code changes" to chat calls;
  tool gating and convention attachment touch wiring

Overclaim/tone:
- Tool fixed-price assumption: qualify "approximated with one
  number" and note variable-cost APIs need a future extension
- "where Spring AI itself wants it" → "at the framework's own
  advisor extension point"
- "since this bit a release" → factual reference to v0.3.1 patch

Skipped:
- ChatResponse.Usage typing — upstream starter README uses this
  shorthand; staying consistent with source-of-truth wording

Deferred to round 2:
- ALLOW_WITH_CAPS scope question (out of post scope)
- Spring AI tool auto-decoration claim verification

Title 50/51, description 154/160 unchanged.

3 findings, all applied:

- Factual: drop "retry" from the list of advisor-chain
  responsibilities — round 1 already moved retry to the model
  layer below the advisor; leaving it in the chain list was a
  contradiction
- Overclaim: drop "small amount of" from the commit-failure
  release tradeoff — for a long streamed response, the
  uncommitted cost can be the whole cost
- Code: replace null placeholders in MethodToolCallback example
  with named parameters (Method reflectedMethod, WeatherService
  weatherTarget) so the snippet is illustrative, not a copy-and-
  break

Round 1 deferred items were accepted by codex without
re-litigation: ChatResponse.Usage shorthand stands (matches
upstream README wording), ALLOW_WITH_CAPS scope cut stands,
Spring AI tool auto-decoration claim stands.

Title 50/51, description 154/160 unchanged.
@amavashev

Codex review loop completed — 3 rounds, SHIP verdict

Ran codex exec + codex exec resume --last (codex-cli 0.130.0, --sandbox read-only) as an external technical reviewer over this PR. Three rounds, each with apply/modify/skip evaluation per the blog-process rule (codex output = feedback, not directive). Commits 688a74f and 9677b4d apply round-1 and round-2 fixes.

Round 1 — 16 findings + 2 open questions

Applied 13:

  • Lifecycle is three wire calls (reserve + commit or release) + one delegation, not four
  • Spring AI's spring.ai.retry.* wraps the ChatModel below the advisor — advisor sees one logical call regardless of model retries
  • chatClient.prompt(...).stream() returns a stream-spec; .chatResponse() yields the Flux<ChatResponse>
  • "4x low" CJK estimate softened to "several times low"
  • Tool fixed-price assumption qualified; variable-cost downstream APIs need a future extension point
  • "where Spring AI itself wants it" → "at the framework's own advisor extension point"
  • Fixed inverted "trade explicitness for silent under-billing" — now "choose explicit startup signal over silent under-billing"
  • End-to-end qualified: reservation IDs require convention attachment + emit-reservation-id-on-trace enabled
  • "No call-site code changes" scoped to chat calls only
  • Ending claim of "one application.yml block" softened
  • Streaming pseudocode labeled illustrative with element-type adaptation noted
  • "Since this bit a release" → factual reference to v0.3.1 patch

Modified 2:

  • Release-on-commit-failure path kept (correct state hygiene) with explicit tradeoff note added
  • MethodToolCallback builder marked illustrative with explicit "real builder needs..." comment

Skipped 1:

  • ChatResponse.Usage typing kept as shorthand — upstream README and v0.2.0 release notes use the same shorthand; staying consistent with source-of-truth wording

Deferred to round 2:

  • ALLOW_WITH_CAPS scope question (out of post scope — covered by ai-agent-action-control-hard-limits-side-effects.md)
  • Spring AI tool auto-decoration verification (verified — ToolCallbackResolver is lookup, not chain decoration; starter README claim stands)

Round 2 — 3 findings, all applied

  • Dropped "retry" from list of advisor-chain responsibilities (round 1 moved retry to model layer — leaving it in this list was a contradiction)
  • Dropped "small amount of" from commit-failure tradeoff — for a long streamed response, uncommitted cost can be the whole cost
  • Replaced null placeholders in MethodToolCallback example with named bean parameters (Method reflectedMethod, WeatherService weatherTarget) + toolObject(...) — no longer a copy-and-break trap

Codex accepted all 3 round-1 deferred/skipped items without re-litigation.

Round 3 — SHIP

No new findings. Codex returned SHIP as the overall verdict.

Title 50/51, description 154/160 throughout. Ready for human review.

The skill's Phase 9 said "apply external reviewer feedback
precisely." That contradicted the project's blog-process rule
(feedback is input, not directive — evaluate on merit, push back
when warranted) and over-softened several spec-backed claims in
past reviewer rounds.

Phase 9 is now split:

- Lead: explicit evaluate / apply / modify / skip rule with a
  one-line reason required before touching the file, preserved
  in both the conversation reply and the commit message.
- Phase 9a: codex-cli (0.130.0) as automated external reviewer
  via `codex exec --sandbox read-only` + `codex exec resume
  --last`. Captures the 0.130.0 gotcha that resume does NOT
  accept --sandbox or --cd — those inherit from the original
  session and passing them errors out. Loop until SHIP or
  stylistic-only; cap at 4 rounds.
- Phase 9b: human external reviewer fallback (the original
  user-relay flow) with the same evaluate-on-merit rule.
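
The Phase 9a stopping rule (loop until SHIP or stylistic-only, cap at 4 rounds) can be sketched as a small control loop. This is a hypothetical model, not the skill's implementation: `ReviewLoopSketch`, `Verdict`, and `Outcome` are names invented here, and the reviewer is reduced to a function from round number to verdict.

```java
import java.util.List;
import java.util.function.IntFunction;

final class ReviewLoopSketch {
    enum Verdict { SHIP, STYLISTIC_ONLY, FINDINGS }

    static final int MAX_ROUNDS = 4; // cap stated in the Phase 9a description

    /** How many rounds ran and whether the reviewer converged before the cap. */
    static final class Outcome {
        final int rounds;
        final boolean converged;
        Outcome(int rounds, boolean converged) {
            this.rounds = rounds;
            this.converged = converged;
        }
    }

    /** Runs rounds until the reviewer returns SHIP or STYLISTIC_ONLY, or the cap is hit. */
    static Outcome run(IntFunction<Verdict> reviewer) {
        for (int round = 1; round <= MAX_ROUNDS; round++) {
            Verdict verdict = reviewer.apply(round);
            if (verdict == Verdict.SHIP || verdict == Verdict.STYLISTIC_ONLY) {
                return new Outcome(round, true);
            }
            // FINDINGS: each item is evaluated on merit (apply / modify / skip)
            // and the fixes are committed before the next round starts.
        }
        return new Outcome(MAX_ROUNDS, false);
    }

    public static void main(String[] args) {
        // Verdict sequence reported for this PR: findings, findings, SHIP.
        List<Verdict> verdicts = List.of(Verdict.FINDINGS, Verdict.FINDINGS, Verdict.SHIP);
        Outcome outcome = run(round -> verdicts.get(round - 1));
        System.out.println(outcome.rounds + " rounds, converged=" + outcome.converged);
    }
}
```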

Validated against #642 (Spring AI advisor post):
3 codex rounds, 19 findings, 18 applied/modified, 1 scope-skipped
with reason, codex converged to SHIP in round 3.

Both skill copies (.claude/skills/blog and .agents/skills/blog)
updated identically.
@amavashev changed the title from "blog: Pre-Call Budget Reservation as a Spring AI Advisor" to "blog + skill: Spring AI advisor post + Phase 9 codex-loop rewrite" on May 14, 2026
amavashev added 3 commits May 14, 2026 07:02
…isor post

Verified both findings against
runcycles/cycles-spring-ai-starter/.../advisor/CyclesBudgetStreamAdvisor.java.
Both correct; both applied.

Finding 1 (medium): streaming commit-failure release path

Reviewer was right. The actual source attaches doOnError to the
upstream Flux BEFORE concatWith adds the commit Mono. In Reactor,
that means doOnError observes upstream terminal signals only —
commit-Mono errors propagate to the subscriber as onError but do
NOT re-trigger the upstream's doOnError. The source's own javadoc
confirms this scope: "commit failures in fail-closed mode
propagate as onError to subscribers correctly" — with no mention
of release on commit failure.

So the post's earlier claim "the doOnError path then releases the
reservation rather than leaving it stranded" was wrong, and the
"clean reservation-state invariant" framing was wrong. On commit
failure the reservation is NOT explicitly released; cleanup relies
on the server's reservation TTL expiry. The pseudocode also had
the wrong operator order (doOnError after concatWith) which
contradicted the actual source.

Fix:
- Reorder pseudocode to match source: doOnNext → doOnError →
  doOnCancel → concatWith(commitMono)
- Rename concatWith arg to commitThenEmptyOrError(...) to make
  the failure path visible
- Rewrite the prose to: (a) note doOnError observes upstream
  errors only, (b) say commit-failure cleanup relies on
  server-side TTL expiry, (c) frame the tradeoff honestly —
  fail-closed on subscriber signal, deliberate non-handling of
  commit-failure release
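
The scoping point in Finding 1 can be illustrated without Reactor. The sketch below is a plain-Java analogy, not the advisor's code and not Reactor semantics in full: a `try/catch` around the upstream stage stands in for `doOnError` attached before `concatWith`, and the commit stage runs afterwards, outside that hook. `ErrorScopeSketch` and its log strings are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ErrorScopeSketch {
    /**
     * Analogy only: the inner catch plays the role of doOnError on the upstream,
     * the commit call plays the role of concatWith(commitMono), and the outer
     * catch is what the subscriber observes as onError.
     */
    static List<String> run(Runnable upstream, Runnable commit) {
        List<String> log = new ArrayList<>();
        try {
            try {
                upstream.run();
            } catch (RuntimeException e) {
                log.add("release");              // hook sees upstream failures only
                throw e;
            }
            commit.run();                         // appended after the hook's scope
            log.add("complete");
        } catch (RuntimeException e) {
            log.add("onError:" + e.getMessage()); // delivered to the subscriber
        }
        return log;
    }

    public static void main(String[] args) {
        Runnable ok = () -> { };
        System.out.println(run(ok, ok));
        System.out.println(run(() -> { throw new IllegalStateException("upstream"); }, ok));
        System.out.println(run(ok, () -> { throw new IllegalStateException("commit"); }));
    }
}
```

In the third case the subscriber still sees the error, but no release is logged, which mirrors the corrected claim: cleanup after a commit failure is left to reservation TTL expiry on the server.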

Finding 2 (low): "tool that internally calls an LLM" overclaim

Reviewer was right. "Goes through the chat advisor" is only true
if the tool uses the auto-configured ChatClient. Tools that use a
raw provider SDK or build a custom ChatClient.Builder without the
starter's ChatClientCustomizer bypass the advisor entirely.

Fix: qualify the claim to "via the auto-configured ChatClient" and
add an explicit note that bypassing tools (raw SDK / custom
builder) get neither the tool-gate commit nor the chat-advisor
reservation, leaving their LLM cost invisible to Cycles.

Title 50/51, description 154/160 unchanged.

Two changes to .claude/skills/blog/SKILL.md and
.agents/skills/blog/SKILL.md (kept in sync):

1. Phase 4 (Claude internal Cycle 1) — split the fact-check step
   into a text-claim fact-check and a separate source-code audit
   step. The source-code audit fetches the actual upstream files
   (gh api / base64 -d) and verifies operator order in
   reactive/async code, method signatures, error/release paths,
   fluent-builder requirements, and any quoted identifier. Calls
   out explicitly that the post's own pseudocode is NOT ground
   truth — it is the thing being audited.

2. Phase 9a (codex external reviewer prompt template) — expanded
   the prompt skeleton to require codex to name the upstream
   source repos and fetch the relevant source files BEFORE
   judging code-level claims. Same verification list as Phase 4.
   Added a "why this is mandatory" paragraph citing the PR #642
   miss (Reactor doOnError/concatWith operator-order bug that
   shipped through 3 codex rounds + Claude cycles 1-3 before a
   sibling codex session with a broader prompt caught it).
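
A sketch of the decode-and-check half of that audit, kept in Java to match the code under review. It assumes the fetched payload is base64 text that may be line-wrapped, which is why it uses `Base64.getMimeDecoder()`. `SourceFetchSketch`, `decodeContent`, `appearsBefore`, and the sample source string are hypothetical, and a textual position check is only a first-pass signal, not a substitute for reading the file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class SourceFetchSketch {
    /** Decodes base64 text that may contain line breaks (the MIME decoder skips them). */
    static String decodeContent(String base64Content) {
        byte[] raw = Base64.getMimeDecoder().decode(base64Content);
        return new String(raw, StandardCharsets.UTF_8);
    }

    /** True when both identifiers occur and the first occurs earlier in the text. */
    static boolean appearsBefore(String source, String first, String second) {
        int a = source.indexOf(first);
        int b = source.indexOf(second);
        return a >= 0 && b >= 0 && a < b;
    }

    public static void main(String[] args) {
        // Stand-in for a fetched file; the real audit would decode the upstream source.
        String sample = "flux.doOnNext(track).doOnError(release).concatWith(commit)";
        String wrapped = Base64.getMimeEncoder(16, new byte[] {'\n'})
                .encodeToString(sample.getBytes(StandardCharsets.UTF_8));

        String decoded = decodeContent(wrapped);
        System.out.println(decoded.equals(sample));
        System.out.println(appearsBefore(decoded, "doOnError", "concatWith"));
    }
}
```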

Rationale: prose-only audits and README cross-checks miss bugs
where the prose matches the README's surface description but
contradicts the actual source code. The fix is to make
source-fetching mandatory in the prompts.

No app code or blog content changed by this commit.

…oring

Five changes to .claude/skills/blog/SKILL.md and
.agents/skills/blog/SKILL.md (kept in sync):

1. Add a top-level "Review goal" section listing the eight
   quality dimensions (factual, credibility, cross-links, SEO,
   code accuracy, structure & flow, terminology, tone & style)
   that EVERY review pass — Claude internal, codex, and human —
   must cover. No dimension is "owned" by a single phase. Closes
   the gap where a clean factual review could ship past a tone
   or SEO issue.

2. Phase 4 (Cycle 1) expanded from 4 parallel agents to 5: added
   an explicit "Style / tone / terminology audit" agent covering
   dims 6, 7, 8. Each agent annotated with which dimensions it
   owns, so the parallel set jointly covers all eight.

3. Phase 5 (Cycle 2) re-read checklist made explicit: flow &
   integration, consistency, softening of absolutes, filler
   removal — split into four numbered checks instead of one
   prose line.

4. Phase 9a (codex prompt template) now requires:
   - Comprehensive coverage of all eight dimensions, with output
     bucketed by FACTUAL / OVERCLAIM / CROSS-LINKS / SEO / CODE
     / STRUCTURE / TERMINOLOGY / TONE / OPEN QUESTIONS — one
     bucket per dimension, NONE allowed if clean. Forces codex
     to address each dimension rather than picking favorites.
   - Source-code fetching (unchanged from prior commit).

5. NEW Phase 10 "Final Scoring & Summary" (renumbers old Phase
   10 Publish → Phase 11):
   - After all reviews settle, score the final post 1-10 across
     all 8 dimensions with one-line justifications, average to a
     single overall score that must remain >= 9.0.
   - Present a single final summary to the user: title/slug/path,
     frontmatter budget status, per-dimension scorecard, overall
     score, review cycles run, notable changes summary, open
     caveats, explicit "Ready to merge?" ask.
   - Wait for user confirmation before merge / final push.
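
The Phase 10 scoring rule (eight dimensions, each scored 1-10, averaged to an overall score that must stay at or above 9.0) can be sketched as follows. The class name, method names, and the example scores are hypothetical; only the dimension names, the range, and the 9.0 bar come from the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ScorecardSketch {
    static final int DIMENSIONS = 8;
    static final double MIN_OVERALL = 9.0;

    /** Averages the per-dimension scores after checking count and the 1-10 range. */
    static double overall(Map<String, Double> scores) {
        if (scores.size() != DIMENSIONS) {
            throw new IllegalArgumentException("expected 8 dimensions, got " + scores.size());
        }
        double sum = 0;
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            double score = entry.getValue();
            if (score < 1 || score > 10) {
                throw new IllegalArgumentException(entry.getKey() + " out of range: " + score);
            }
            sum += score;
        }
        return sum / DIMENSIONS;
    }

    static boolean meetsBar(Map<String, Double> scores) {
        return overall(scores) >= MIN_OVERALL;
    }

    public static void main(String[] args) {
        // Example scores are made up for illustration.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("factual", 10.0);
        scores.put("credibility", 9.0);
        scores.put("cross-links", 9.0);
        scores.put("SEO", 9.0);
        scores.put("code accuracy", 10.0);
        scores.put("structure & flow", 9.0);
        scores.put("terminology", 9.0);
        scores.put("tone & style", 9.0);
        System.out.println(overall(scores));  // 74 / 8 = 9.25
        System.out.println(meetsBar(scores));
    }
}
```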

Rationale: the user explicitly asked for "after all reviews are
done, score the final post and present me with final summary."
Encoding it in the skill means future /blog runs do it
automatically rather than relying on conversation memory.
@amavashev amavashev merged commit 5e7edfe into main May 14, 2026
5 checks passed
@amavashev amavashev deleted the blog/cycles-spring-ai-starter-advisors-walkthrough branch May 14, 2026 11:11