Fix coordinator stalls from large role transactions #37380
Open
SangJunBak wants to merge 3 commits into
TableTransaction ran an O(n) uniqueness scan over the whole collection on every update_by_key/set/set_many call, even when the update could not change uniqueness (a privilege or owner change leaves name/schema/type untouched). Bulk operations that update the same objects repeatedly, such as GRANT ... ON ALL TABLES IN SCHEMA, call this in a loop, turning a single catalog transaction into O(ops * catalog_items) work and wedging the single-threaded coordinator. Add a second per-collection predicate, has_unique_key_changed, alongside uniqueness_violation. An update is scanned only when it changed a field uniqueness_violation reads.
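The fast path described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `Item`, `Table`, the `scans` counter, and the `u64` keys are hypothetical stand-ins, and only the two predicate names (`uniqueness_violation`, `has_unique_key_changed`) and `update_by_key` are taken from the commit message. It assumes uniqueness depends on `name` alone.

```rust
use std::collections::BTreeMap;

/// Hypothetical catalog item: only `name` participates in uniqueness.
#[derive(Clone, Debug, PartialEq)]
pub struct Item {
    pub name: String,
    pub owner: String,
}

/// Hypothetical stand-in for a per-collection transaction table.
pub struct Table {
    pub items: BTreeMap<u64, Item>,
    /// Counts full scans so the effect of the fast path is observable.
    pub scans: usize,
}

/// True when two items would collide on the unique key.
fn uniqueness_violation(a: &Item, b: &Item) -> bool {
    a.name == b.name
}

/// True when an update touched a field that `uniqueness_violation` reads.
fn has_unique_key_changed(old: &Item, new: &Item) -> bool {
    old.name != new.name
}

impl Table {
    pub fn new() -> Self {
        Table { items: BTreeMap::new(), scans: 0 }
    }

    /// Inserts without checking uniqueness (setup helper for the sketch).
    pub fn insert(&mut self, key: u64, item: Item) {
        self.items.insert(key, item);
    }

    /// Replaces the item at `key`, scanning the collection only when the
    /// update changed a field that uniqueness depends on.
    pub fn update_by_key(&mut self, key: u64, new: Item) -> Result<(), String> {
        let changed = match self.items.get(&key) {
            Some(old) => has_unique_key_changed(old, &new),
            None => return Err(format!("no item with key {key}")),
        };
        if changed {
            // O(n) scan, now paid only by updates that can affect uniqueness.
            self.scans += 1;
            let clash = self
                .items
                .iter()
                .any(|(k, other)| *k != key && uniqueness_violation(other, &new));
            if clash {
                return Err(format!("name {:?} already exists", new.name));
            }
        }
        self.items.insert(key, new);
        Ok(())
    }
}
```

In this sketch an owner-only update leaves `scans` at zero, while a rename still pays for the full scan and is rejected if it collides with another item.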
GRANT/REVOKE expands to one Op::UpdatePrivilege per (target, grantee), so GRANT ... ON ALL TABLES IN SCHEMA s TO r1,...,rN produced N_tables * N_grantees ops, each rewriting its target object durably. Carry a batch of MzAclItems per target in Op::UpdatePrivilege and apply them with a single durable write per target, collapsing the cross product to one write per object. Audit events stay one per grantee.
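To illustrate the batching idea, here is a small sketch that groups per-(target, grantee) entries into one op per target. The types `AclItem` and `UpdatePrivilegeOp`, the `u64` target ids, and both function names are hypothetical simplifications; the real `Op::UpdatePrivilege` and `MzAclItem` carry more information than shown here.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for an ACL entry.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct AclItem {
    pub grantee: String,
    pub privilege: String,
}

/// Hypothetical op carrying every ACL item for one target object.
#[derive(Debug, PartialEq, Eq)]
pub struct UpdatePrivilegeOp {
    pub target: u64,
    pub acl_items: Vec<AclItem>,
}

/// Groups (target, ACL item) pairs into one op per target, so each target
/// needs a single durable write. Order within a target is preserved.
pub fn batch_by_target(pairs: Vec<(u64, AclItem)>) -> Vec<UpdatePrivilegeOp> {
    let mut grouped: BTreeMap<u64, Vec<AclItem>> = BTreeMap::new();
    for (target, item) in pairs {
        grouped.entry(target).or_default().push(item);
    }
    grouped
        .into_iter()
        .map(|(target, acl_items)| UpdatePrivilegeOp { target, acl_items })
        .collect()
}

/// Audit events are not collapsed: one per ACL item in every op.
pub fn audit_event_count(ops: &[UpdatePrivilegeOp]) -> usize {
    ops.iter().map(|op| op.acl_items.len()).sum()
}
```

With 3 tables and 2 grantees, the 6 input pairs collapse to 3 ops (3 writes), while `audit_event_count` still reports 6.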
… during bulk GRANT
Adds a BulkPrivilegeGrant parallel-benchmark scenario: a background thread runs GRANT/REVOKE SELECT ON ALL TABLES IN SCHEMA to 150 roles back to back over 150 tables, while ten closed loops measure SELECT 1 latency. Asserts that SELECT 1 max latency stays under 30s. Before the durable-layer and op-batching fixes, a single bulk grant wedged the coordinator for minutes.
Let n be the number of catalog items. In this PR, we do two optimizations:
Motivation
Fixes sql-421
Verification
Verified by running Dennis' repro from the ticket. I also created a parallel workload for it and tested it before and after the change. With the change the workload passes, which means the coordinator no longer blocks for minutes.