fix: record git author identity and authored timestamp in PR attestations across providers (server#5479)#957
Conversation
The types.Commit Go field was named Committer but tagged json:"author", the naming mismatch at the root of the author/committer conflation in PR attestations. Rename the field (and CommitterUsername -> AuthorUsername) so the Go name matches the wire contract. Pure rename: the json tags are unchanged, so wire output is identical and behaviour is preserved across all four providers. Each provider's data source (which currently still feeds committer data into some of these fields) is corrected in follow-up commits, now visible at the construction sites. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…479)
The GitHub GraphQL queries only fetched each commit's committer, and
buildPREvidence mapped that into the author field. For web-flow commits
(applied suggestions, bot commits) the committer is "GitHub
<noreply@github.com>", so the real author was lost and author_username
dropped (the committer has no associated GitHub user).
Add author { name email user { login } } to the shared commit node (which
both PR-by-number and PR-by-commit queries use) and map it into Author /
AuthorUsername. Adds a red-first unit test on buildPREvidence with a
web-flow commit where author != committer.
GitLab and Azure populate the same fields from committer data via separate
code paths and are corrected in follow-up commits.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Claude finished @AlexKantor87's task in 1m 54s —— View job Review: PR #957 — record git author identity & authored timestamp
Solid, well-scoped change. The refactor-to-pure-function-then-fix approach makes the per-provider mapping unit-testable without live APIs, and the regression tests are specific (web-flow fixture with author ≠ committer, authored ≠ committed date). Build and Correctness — looks right
Minor observations (non-blocking)
Already-resolved threadsThe earlier review points (commit-timestamp consistency, dead Note for reviewers / changelogAs the PR description calls out: the wire schema is unchanged but recorded values change (author identity + authored timestamps; Azure loses I couldn't execute the unit tests in this sandbox (no exec permission), but they're pure functions with no network dependency and CI covers them. Overall: clean, tested, conventional-commits-compliant, and the slicing discipline is exemplary. 👍 |
Snyk Code's NoHardcodedCredentials rule flags the GitHub username in build_pr_evidence_test.go (a graphql Login field set to "tooky") as a hardcoded credential. It is a public identifier in test data, not a secret. Add the test file to the existing .snyk exclude.global list, matching how the repo already suppresses Snyk Code false positives. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…479) Address review feedback on cli#957. Add TODO(server#5479) markers at the known-deferred spots so the follow-up slices are easy to find: - GitLab commit author still sourced from CommitterName/CommitterEmail - Azure AuthorUsername still sourced from the committer - GitHub commit timestamp uses the committer date; author-date alignment (authoredDate) is tracked as a separate cross-provider slice No behaviour change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pure extraction of the GitLab commit mapping so it can be unit-tested directly. Behaviour unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…479) GetMergeRequestCommits mapped CommitterName/CommitterEmail into the author field. Map AuthorName/AuthorEmail instead. Red-first test with an author != committer fixture. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pure extraction of the Azure commit mapping for direct unit testing. Behaviour unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Azure recorded the author display name only and populated author_username from the committer. Record the author as "Name <email>" (consistent with the other providers) and drop author_username: Azure commits carry no login, so the only honest options are the committer name (wrong) or nothing. Red-first test asserts the format and the dropped field. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fetch authoredDate in graphqlCommitNode and use it for the commit timestamp, falling back to committedDate when absent. Matches the recorded author identity; for rebased/applied-suggestion commits the two dates differ. Red-first test plus a fallback test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Timestamp now sourced from commit.Author.Date rather than the committer date, matching the recorded author identity. Red-first test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Prefer AuthoredDate, falling back to CreatedAt when nil. Matches the recorded author identity. Red-first test plus a fallback test. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y (server#5479) buildPREvidence reads only Author/AuthoredDate (CommittedDate as fallback), so the Committer block in graphqlCommitNode was fetched on every PR query and never consumed. Remove it. The deferred committer-capture work will re-add what it needs. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…#5479) Follow-up to dropping the Committer sub-selection: the regression test no longer sets node.Commit.Committer (the field is gone). The query no longer fetches the committer, so the test sets only the author. Fixes the build broken by faee235. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
tooky
left a comment
There was a problem hiding this comment.
I think this is fine, thanks.
I was surprised to find we were constructing "Person Name person@example.com" strings. You'd think that's what's in the commit... so that's already been parsed out and we're undoing it...
But another one for another day I suppose.
What and why
Fixes kosli-dev/server#5479:
kosli attest prrecorded the git committer in the commitauthorfield, and the commit timestamp from the committer/commit date. For commits made through a web flow (applied review suggestions, bot commits, rebases) the committer differs from the author, so the real author was lost — e.g. an agent-authored commit committed via GitHub web flow attested asGitHub <noreply@github.com>with noauthor_username.This PR records the git author's identity and authored timestamp across all providers. It consolidates the former #958 (folded into this branch so the whole change is one reviewable unit); plan: https://github.com/kosli-dev/server/issues/5479#issuecomment-4690357750
Per-provider before → after
authorauthor_usernametimestampauthor { name email user { login } })AuthorName/AuthorEmail)Name <email>author.raw) — unchangeddatefield — unchangedThe Go field
types.Commit.Committer(taggedjson:"author") — the naming mismatch at the root of the conflation — is renamed toAuthor/AuthorUsername; wire tags are unchanged. Azure commits expose no login, soauthor_usernameis dropped (omitempty) rather than populated from the wrong source.How it's built
Each provider's commit mapping is a pure function (
buildPREvidencefor GitHub,commitFrom<Provider>Commitfor GitLab/Azure) so the logic is unit-tested without live API calls. Refactor-then-fix per slice, every fix preceded by a failing test (web-flow fixture with author ≠ committer; authored ≠ committed date). Bitbucket verified read-only.Compatibility note (for changelog + docs)
Wire schema unchanged; values change. Policies (Rego or jq) reading commit
authorortimestampmay legitimately flip outcomes — never-alone / four-eyes checks comparing commit author to approvers, and any commit-time-window logic. Azure payloads loseauthor_username(previously the committer's display name).Verification
go build ./...,go vet,make lint(0 issues), and unit tests for all changed packages pass. The full integration suite (test / Test) and the multi-LLM review passed on the consolidated change in #958 before the fold; CI re-runs here on the combined branch.🤖 Generated with Claude Code