perf: add sort-serving segments index for KillUnusedSegments query #19645

Draft
jtuglu1 wants to merge 1 commit into apache:master from jtuglu1:kill-unused-segments-sort-index

@jtuglu1 jtuglu1 commented Jun 30, 2026

Description

KillUnusedSegments' per-datasource find-interval query
(SqlSegmentsMetadataQuery#retrieveUnusedSegmentIntervals) runs:

WHERE dataSource=? AND used=? AND end<=? [AND start>=?]
  AND used_status_last_updated<=? ORDER BY start, end LIMIT n

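The existing (dataSource, used, end, start) index orders by end before start, so it
cannot serve ORDER BY start, end. On a large datasource (~470k unused segments) MySQL
materializes all matching rows and filesorts them just to return LIMIT n; EXPLAIN ANALYZE
measured ~11s, with the scan dominating. With ~50 datasources this duty runs at ~43s/cycle,
bound almost entirely by this SQL call.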

Baseline plan (no new index, ORDER BY start, end)

Query:

EXPLAIN ANALYZE
SELECT start, `end` FROM druid_segments
WHERE dataSource = '<datasource>' AND used = false
  AND `end` <= '<max-end>'
  AND used_status_last_updated IS NOT NULL
  AND used_status_last_updated <= '<buffer-cutoff>'
ORDER BY start, `end` LIMIT 1000;

Plan:

-> Limit: 1000 row(s)  (actual time=10959..10959 rows=1000)
  -> Sort: start, `end`, limit input to 1000 row(s) per chunk  (actual time=10959..10959 rows=1000)
    -> Filter: (used = false AND `end` <= '<max-end>' AND used_status_last_updated <= '<buffer-cutoff>')
              (rows=418976)  (actual time=0.365..9765)
      -> Index lookup using IDX_<dataSource-only>  (rows=806095)  (actual time=0.362..9632)

~9.6s scanning, ~1.2s sorting, ~11s total.

Options Considered

  1. Create a new sort-serving index (this PR)

Add (dataSource, used, start, end, used_status_last_updated). The (dataSource, used)
equality prefix + (start, end) matches the ORDER BY, so the filesort is removed and the
LIMIT short-circuits; used_status_last_updated trailing makes the query covering. The
ORDER BY start, end is preserved, so kill semantics are unchanged.
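
The ordering argument can be reproduced on a toy table. The sketch below is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for MySQL, a reduced schema with just the columns the query touches, a made-up index name (idx_demo), and made-up parameter values. It shows plan shape only, not the timings reported in this PR, and SQLite's planner is not MySQL's.

```python
import sqlite3

# Toy stand-in for druid_segments (assumption: SQLite, reduced column set).
DDL = """
CREATE TABLE druid_segments (
  dataSource TEXT NOT NULL,
  used BOOLEAN NOT NULL,
  start TEXT NOT NULL,
  "end" TEXT NOT NULL,
  used_status_last_updated TEXT
)
"""

# Same shape as the find-interval query; "end" is quoted because END is a keyword.
QUERY = """
SELECT start, "end" FROM druid_segments
WHERE dataSource = ? AND used = ? AND "end" <= ?
  AND used_status_last_updated IS NOT NULL
  AND used_status_last_updated <= ?
ORDER BY start, "end" LIMIT 1000
"""

# Hypothetical parameter values, only needed so the statement can be prepared.
PARAMS = ("wiki", 0, "2026-01-01", "2026-06-01")


def plan_for(index_columns):
    """Create the toy table with one index and return the query plan as text."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.execute(f"CREATE INDEX idx_demo ON druid_segments ({index_columns})")
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    conn.close()
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return "\n".join(row[3] for row in rows)


# Existing index shape: end comes before start.
old_plan = plan_for('dataSource, used, "end", start')
# Proposed index shape: equality prefix, then the ORDER BY columns, then the filter column.
new_plan = plan_for('dataSource, used, start, "end", used_status_last_updated')

print("old index:", old_plan, sep="\n")
print("new index:", new_plan, sep="\n")
```

With the end-first index the plan should contain a separate sort step for the ORDER BY, while the start-first index lets the engine read rows already in (start, end) order and stop at the LIMIT.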

The optimized plan was confirmed by running the symmetric query against the existing
(dataSource, used, end, start) index with ORDER BY end, start:

EXPLAIN ANALYZE
SELECT start, `end` FROM druid_segments
WHERE dataSource = '<datasource>' AND used = false
  AND `end` <= '<max-end>'
  AND used_status_last_updated IS NOT NULL
  AND used_status_last_updated <= '<buffer-cutoff>'
ORDER BY `end`, start LIMIT 1000;

Plan:

-> Limit: 1000 row(s)  (actual time=0.382..15.5 rows=1000)
  -> Filter: (used_status_last_updated IS NOT NULL AND used_status_last_updated <= '<buffer-cutoff>')
            (rows=1000)  (actual time=0.381..15.4)
    -> Index range scan using IDX_<dataSource,used,end,start>
       over (dataSource = '<datasource>' AND used = 0 AND `end` <= '<max-end>')  (rows=1000)  (actual time=0.38..15.2)

~11,000ms → ~15ms. The new index produces this same plan for the unchanged
ORDER BY start, end.

With this change, we may also be able to drop the old (dataSource, used, end, start) index, since no other queries appear to use it; this still needs confirming.

  2. Reformat the query to reuse an existing index

Flip the query to ORDER BY end, start so the existing (dataSource, used, end, start)
index serves it; this is the ~15ms plan measured above.

I opted against this because it changes kill semantics in a way that breaks existing behavior. KillUnusedSegments
drains earliest-start-first behind a start-based cursor (datasourceToLastKillIntervalEnd
--> the next query filters start >= cursor), and limitToPeriod always retains the segments at the
earliest start. Making this safe requires reworking the drain to be end-consistent (cursor, ordering,
and limitToPeriod all keyed on end), which seemed like more work. Open to opinions.

Release note


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

…erval query

KillUnusedSegments' per-datasource find-interval query
(SqlSegmentsMetadataQuery#retrieveUnusedSegmentIntervals) runs
`WHERE dataSource=? AND used=? AND end<=? [AND start>=?]
 AND used_status_last_updated<=? ORDER BY start, end LIMIT n`.

The existing (dataSource, used, end, start) index orders by end before
start, so it cannot serve `ORDER BY start, end`. On a large datasource
(~470k unused segments) MySQL materializes all matching rows and filesorts
them just to return LIMIT n; EXPLAIN ANALYZE measured ~11s, with the scan
dominating. With ~50 datasources this drove the duty to ~43s/cycle, almost
all in the find-interval SQL.
jtuglu1 force-pushed the kill-unused-segments-sort-index branch from a3d5b17 to c343d33 on June 30, 2026 22:28
jtuglu1 requested a review from kfaraz on June 30, 2026 22:49
jtuglu1 added this to the 38.0.0 milestone on Jun 30, 2026