Skip to content

feat: add sensitive data filtering methods#1330

Open
giortzisg wants to merge 9 commits into
feat/data-collectionfrom
feat/data-denylist
Open

feat: add sensitive data filtering methods#1330
giortzisg wants to merge 9 commits into
feat/data-collectionfrom
feat/data-denylist

Conversation

@giortzisg

@giortzisg giortzisg commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Description

This PR adds the filtering methods for sensitive data. Using the new methods means that the SDK now will properly apply scrubbing to collected headers, cookies, query parameters and request bodies (the SDK previously collected raw bodies and query strings. The followup PR will take care of the plumbing and fixing all failing tests). SendDefaultPII also remains supported for backwards compatibility along with the fixes on raw sensitive data.

#skip-changelog

Issues

Changelog Entry Instructions

To add a custom changelog entry, uncomment the section above. Supports:

  • Single entry: just write text
  • Multiple entries: use bullet points
  • Nested bullets: indent 4+ spaces

For more details: custom changelog entries

Reminders

@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown

Semver Impact of This PR

🟡 Minor (new features)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


This PR will not appear in the changelog.


🤖 This preview updates automatically when you update the PR.

@giortzisg giortzisg force-pushed the feat/data-denylist branch 4 times, most recently from 4e2524c to e8e7f1e Compare June 23, 2026 08:52
@giortzisg giortzisg changed the title add sensitive denylist feat: add sensitive data filtering methods Jun 23, 2026
@giortzisg giortzisg marked this pull request as ready for review June 23, 2026 09:03
Comment thread data_collection_filter.go Outdated
Comment thread data_collection_filter.go Outdated
@giortzisg giortzisg force-pushed the feat/data-denylist branch from e8e7f1e to aed4a07 Compare June 23, 2026 13:13
Comment thread data_collection_filter.go Outdated
@giortzisg giortzisg force-pushed the feat/data-denylist branch from aed4a07 to 0c364f9 Compare June 23, 2026 13:15
Comment thread data_collection_filter.go Outdated
Comment thread data_collection_filter.go Outdated
Comment thread data_collection_filter.go Outdated
@giortzisg giortzisg force-pushed the feat/data-denylist branch from 0c364f9 to 9f3c30a Compare June 24, 2026 07:27
Comment thread data_collection_filter.go
@giortzisg giortzisg force-pushed the feat/data-denylist branch from 9f3c30a to 389be77 Compare June 24, 2026 07:30
Comment thread data_collection_filter.go
@giortzisg giortzisg self-assigned this Jun 24, 2026
Comment thread data_collection_filter.go Outdated
Comment on lines +36 to +38
func isSensitiveKey(key string) bool {
return matchesDenyTerms(key, sensitiveDenyList)
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a bit incomplete because it's always used with !matchesDenyTerms(k, behavior.Terms) or matchesDenyTerms(k, behavior.Terms) and doesn't appear to be allowed to exist on its own.

Would it make sense to have two functions, one for
isSensitiveKey(k) || !matchesDenyTerms(k, behavior.Terms) and one with isSensitiveKey(k) || matchesDenyTerms(k, behavior.Terms) and give them descriptive names?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of separation this probably makes more sense and would be easier to distinguish, so let me change that.

Comment thread data_collection_filter.go Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit f0b976d. Configure here.

Comment thread data_collection.go Outdated
Comment thread data_collection_filter.go Outdated
@giortzisg giortzisg requested a review from Litarnus June 26, 2026 08:04
Comment thread data_collection_filter.go
Comment on lines +81 to +86
for _, t := range dc.HTTPBodies {
if t == bt {
return true
}
}
return false

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be return dc.HTTPBodies.Contains(bt)?

Comment thread data_collection_filter.go

// matchesDenyTerms reports whether the key (case-insensitive) contains any of
// the given terms as a substring.
func matchesDenyTerms(key string, terms []string) bool {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's a bit confusing that the name contains deny even though it's used for allow and deny. Maybe we can call it just matchesTerms?

@giortzisg giortzisg force-pushed the feat/data-collection branch from c976792 to a1f3a34 Compare June 29, 2026 12:40
@giortzisg giortzisg force-pushed the feat/data-denylist branch from 29a8d32 to c0138b5 Compare June 29, 2026 12:40
@giortzisg giortzisg force-pushed the feat/data-collection branch from a1f3a34 to 264848f Compare July 1, 2026 08:44
@giortzisg giortzisg force-pushed the feat/data-denylist branch from c0138b5 to 0afaade Compare July 1, 2026 08:44
@giortzisg giortzisg force-pushed the feat/data-collection branch from 264848f to eb379b3 Compare July 1, 2026 10:10
@giortzisg giortzisg force-pushed the feat/data-denylist branch from 0afaade to 0b51832 Compare July 1, 2026 10:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants