perf: Interval tree for managing segment metadata in memory by pirvtech · Pull Request #19138 · apache/druid

pirvtech · 2026-03-11T21:06:55Z

Segment metadata held in memory on the Historicals is used to look up the segments that match an interval, both for queries and for segment loading. Currently this lookup is a serial scan that goes through all segment metadata in ascending start-time order to find the matching segments.

This change introduces an interval tree as a more efficient way to store segment metadata in memory. It speeds up searches for segments and can potentially cut search times from O(n) to O(log n).
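To make the idea concrete, here is a minimal sketch of an interval tree lookup. It is not the implementation in this PR: `Span` and `SpanTree` are hypothetical names, intervals are modelled as half-open `[start, end)` ranges of epoch millis rather than Joda `Interval`s, and the tree is built once from a sorted array instead of being rebalanced on insert. Each node caches the maximum end time in its subtree, which lets the search skip whole subtrees that cannot overlap the query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for a segment interval: half-open [start, end) in epoch millis. */
final class Span
{
  final long start;
  final long end;

  Span(long start, long end)
  {
    this.start = start;
    this.end = end;
  }

  boolean overlaps(Span other)
  {
    return start < other.end && other.start < end;
  }
}

/** Static, balanced interval tree stored implicitly in an array sorted by start time. */
final class SpanTree
{
  private final Span[] sorted;  // spans ordered by start; the middle of a range is its subtree root
  private final long[] maxEnd;  // maxEnd[root] = largest end time anywhere in that subtree

  SpanTree(List<Span> spans)
  {
    sorted = spans.toArray(new Span[0]);
    Arrays.sort(sorted, Comparator.comparingLong((Span s) -> s.start));
    maxEnd = new long[sorted.length];
    build(0, sorted.length - 1);
  }

  private long build(int lo, int hi)
  {
    if (lo > hi) {
      return Long.MIN_VALUE;
    }
    int mid = (lo + hi) >>> 1;
    long max = Math.max(sorted[mid].end, Math.max(build(lo, mid - 1), build(mid + 1, hi)));
    maxEnd[mid] = max;
    return max;
  }

  /** Returns all spans overlapping the query, in ascending start order. */
  List<Span> findOverlapping(Span query)
  {
    List<Span> out = new ArrayList<>();
    search(0, sorted.length - 1, query, out);
    return out;
  }

  private void search(int lo, int hi, Span query, List<Span> out)
  {
    if (lo > hi) {
      return;
    }
    int mid = (lo + hi) >>> 1;
    if (maxEnd[mid] <= query.start) {
      return; // every span in this subtree ends before the query starts
    }
    search(lo, mid - 1, query, out);
    if (sorted[mid].overlaps(query)) {
      out.add(sorted[mid]);
    }
    if (sorted[mid].start < query.end) {
      search(mid + 1, hi, query, out); // spans to the right start no earlier than this one
    }
  }
}
```

With this layout a query visits O(log n + k) nodes for k matches, whereas the serial scan always touches all n entries. The O(log n) figure therefore applies when the number of matches is small.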

Core changes to file processing/src/main/java/org/apache/druid/timeline/VersionedIntervalTimeline.java
Interval tree implementation in file processing/src/main/java/org/apache/druid/timeline/IntervalTree.java
Documentation comments have been included in important files and sections of code. Unit tests added.

This has been reviewed internally and is in a production Druid cluster.

jtuglu1 · 2026-03-11T22:09:20Z

Hi – thanks. Do you have benchmarks for this?

+  }
+
+  @Override
+  public T remove(Object key)


pirvtech · 2026-03-11T23:03:15Z

> Hi – thanks. Do you have benchmarks for this?

I will add the benchmarks.

kfaraz

Thanks for putting this together, @pirvtech!
I have often felt the need for such a data structure too, even outside the VIT (VersionedIntervalTimeline).

Thoughts on perf

I doubt that a single query will really benefit from this change, since doing a contains() or an overlaps() check on, say, 25k intervals (a fairly large number of intervals for a typical Druid cluster) would not be very compute-intensive.
- You can think of 25k intervals as roughly 3 years' worth of HOUR-granularity data.
Under high concurrency, however, this would still be beneficial, since the VIT does all of its computations inside a giant lock. The shorter we hold the lock, the better.
Either way, we should add benchmarks as @jtuglu1 suggested for queries as well as VIT itself.

Notes on the implementation

I have left some inline comments.

Along with that, the approach would be much cleaner if you do something like this instead:

Add an IntervalNavigableMap<T> interface which extends NavigableMap<Interval, T>. This interface should have the following new methods:
- entriesContaining(Interval interval)
- entriesOverlapping(Interval interval)
- entriesMatching(Interval interval) (Is this really needed?)
- Alternatively, instead of methods that return a sub map, you could have methods that return a matching entry set.
Instead of HashMap and NavigableMap, the VIT class should use this new interface.
Add a class IntervalTreeMap<T> extends TreeMap<Interval, T> implements IntervalNavigableMap<T>, which provides default implementations for the new methods.
- For example, for the entriesContaining() method, we return the whole map.
Add a new class FastSearchIntervalMap<T> which performs the optimised search.
Based on the value of the fastSearch flag passed to the constructor of VIT, choose the implementation of the map.
This would ensure that there are minimal changes to VIT and we can easily swap out different map implementations.
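The layout suggested above might look roughly like the following sketch. Only the names `IntervalNavigableMap`, `IntervalTreeMap`, `FastSearchIntervalMap`, `entriesContaining`, `entriesOverlapping` and the `fastSearch` flag come from the comment; everything else is an assumption made for illustration. In particular, `Interval` here is a small stand-in for Joda's `Interval` (half-open, epoch millis), `IntervalMaps.create` is a hypothetical factory, `entriesMatching` is left out, and the "fast" variant only prunes by start time using `headMap` rather than doing a real interval tree search.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for org.joda.time.Interval: half-open [start, end) in epoch millis. */
final class Interval
{
  final long start;
  final long end;

  Interval(long start, long end)
  {
    this.start = start;
    this.end = end;
  }
}

/** Navigable map keyed by interval, with interval-aware candidate lookups. */
interface IntervalNavigableMap<T> extends NavigableMap<Interval, T>
{
  /** Returns candidate entries whose key may contain the given interval. */
  Map<Interval, T> entriesContaining(Interval interval);

  /** Returns candidate entries whose key may overlap the given interval. */
  Map<Interval, T> entriesOverlapping(Interval interval);
}

/** Baseline: a plain TreeMap that returns the whole map and leaves filtering to the caller. */
class IntervalTreeMap<T> extends TreeMap<Interval, T> implements IntervalNavigableMap<T>
{
  static final Comparator<Interval> BY_START_THEN_END =
      Comparator.comparingLong((Interval i) -> i.start).thenComparingLong(i -> i.end);

  IntervalTreeMap()
  {
    super(BY_START_THEN_END);
  }

  @Override
  public Map<Interval, T> entriesContaining(Interval interval)
  {
    return this;
  }

  @Override
  public Map<Interval, T> entriesOverlapping(Interval interval)
  {
    return this;
  }
}

/** Illustrative "fast" variant: narrows candidates by start time before checking the end. */
class FastSearchIntervalMap<T> extends IntervalTreeMap<T>
{
  @Override
  public Map<Interval, T> entriesContaining(Interval query)
  {
    Map<Interval, T> result = new TreeMap<>(BY_START_THEN_END);
    // Only keys that start at or before the query start can contain the query.
    Interval upperBound = new Interval(query.start, Long.MAX_VALUE);
    for (Map.Entry<Interval, T> e : headMap(upperBound, true).entrySet()) {
      if (e.getKey().end >= query.end) {
        result.put(e.getKey(), e.getValue());
      }
    }
    return result;
  }
}

/** Hypothetical factory mirroring the fastSearch flag on the VIT constructor. */
final class IntervalMaps
{
  static <T> IntervalNavigableMap<T> create(boolean fastSearch)
  {
    return fastSearch ? new FastSearchIntervalMap<T>() : new IntervalTreeMap<T>();
  }
}
```

Because both implementations share one interface, the timeline code would only depend on `IntervalNavigableMap` and could switch implementations from a single place.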

pirvtech · 2026-03-18T18:49:43Z

@kfaraz thanks for the comments. We were facing a scenario where the number of segments was on the order of millions, due to the amount of parallelization in our ingestion tasks, the time spread of the data being ingested, and the relative time offset at which a time interval gets compacted. This change helped our Historicals load and serve segments faster. I had also made another change to make the scanning and loading of local on-disk segments during Historical startup multi-threaded rather than single-threaded, but it looks like someone beat us to the punch in submitting it :). I will review your code feedback and provide my responses.

jtuglu1 · 2026-04-08T21:38:21Z

@kfaraz I think this PR will have much more impact than just Historical. See #19278. Broker timeline operations become terribly slow under high # of intervals. We can use the interval tree here to speed up those operations drastically and speed up queries under heavy segment load/remove callback load.

jtuglu1 · 2026-04-08T22:15:17Z

@pirvtech – are you still working on this patch? We'd love to get this in and experimenting on Broker-side to fix some issues we've run into with large #s of intervals in the VIT. Happy to pick this up otherwise.

kfaraz · 2026-04-11T02:20:49Z

Thanks for the clarification, @jtuglu1 !
Could you share an estimate of the typical number of intervals that causes the slowness? Were you able to capture flamegraphs?

Regardless, I agree that this feature would be generally useful. We just need to clean up the changes a bit.

jtuglu1 · 2026-04-13T16:59:56Z

> Thanks for the clarification, @jtuglu1 ! Could you share an estimate of the typical number of intervals that causes the slowness? Were you able to capture flamegraphs?
>
> Regardless, I agree that this feature would be generally useful. We just need to clean up the changes a bit.

I can get a flamegraph, but see the description of #19278 for explanation. Basically for a full year of MINUTE granularity you get ~500k intervals.
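The interval counts quoted in this thread can be checked with a quick calculation. The helper below is a hypothetical sketch, assuming one interval per time chunk, contiguous data, and no leap days.

```java
import java.time.Duration;

/** Back-of-the-envelope estimate of how many time chunks a contiguous span produces. */
final class IntervalCountEstimate
{
  /** Number of whole granularity-sized chunks that fit in the given span. */
  static long intervalsIn(Duration span, Duration granularity)
  {
    return span.toMillis() / granularity.toMillis();
  }
}
```

`intervalsIn(Duration.ofDays(365), Duration.ofMinutes(1))` returns 525600, in line with the ~500k figure for a year of MINUTE granularity, and `intervalsIn(Duration.ofDays(3 * 365), Duration.ofHours(1))` returns 26280, in line with the earlier estimate of roughly 25k intervals for 3 years of HOUR granularity.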

pirvtech · 2026-04-13T18:30:26Z

> @pirvtech – are you still working on this patch? We'd love to get this in and experimenting on Broker-side to fix some issues we've run into with large #s of intervals in the VIT. Happy to pick this up otherwise.

@jtuglu1 yes was planning to send out updates to review comments this week. Let me know if you have any feedback as well.

jtuglu1 · 2026-04-13T18:40:42Z

> > @pirvtech – are you still working on this patch? We'd love to get this in and experimenting on Broker-side to fix some issues we've run into with large #s of intervals in the VIT. Happy to pick this up otherwise.
>
> @jtuglu1 yes was planning to send out updates to review comments this week. Let me know if you have any feedback as well.

Great, thanks! Feel free to use my benchmarks in #19278. I think it'd be great to extend on those to summarize/benchmark the common operations with a variable size of intervals to ensure we're seeing the expected throughput/latency.

jtuglu1 · 2026-04-22T03:49:09Z

@pirvtech any updates here? I'm happy to get this merged if you're busy

pirvtech · 2026-04-22T18:46:01Z

@jtuglu1 thanks for the patience, submitting updates today.

pirvtech · 2026-04-23T21:22:45Z

Please see the comment above on the pushed config-related changes, @kfaraz.

pirvtech · 2026-05-06T21:30:37Z

Addressed comments from @FrankChen021 in new commit

FrankChen021

Severity	Findings
P0	0
P1	0
P2	0
P3	1
Total	1

This is an automated review by Codex GPT-5

FrankChen021

Reviewed 10 of 10 changed files. The follow-up adds full-traversal APIs and documents the range-pruning contract; current PR call sites use range-monotonic predicates, so no inline reply is needed.

This is an automated review by Codex GPT-5

pirvtech · 2026-05-11T18:44:55Z

Merged latest master into this branch and resolved conflicts.

pirvtech · 2026-05-11T19:21:57Z

What are the next steps?

FrankChen021

I have reviewed the code for correctness, edge cases, concurrency, and integration risks; no issues found.

Reviewed 10 of 10 changed files.

This is an automated review by Codex GPT-5.5

pirvtech · 2026-05-13T18:15:19Z

Any other comments, @FrankChen021 @kfaraz @jtuglu1, or is it ready to merge?

pirvtech · 2026-05-27T19:07:19Z

Any updates from reviewers if it is ready to merge?

kfaraz

Thanks for updating the patch and for your patience, @pirvtech !

Overall, the changes look good. I have left some minor suggestions. A couple of files remain for review, will finish those soon as well.

Please also update the PR description to include a Release note section which describes the new config.

kfaraz · 2026-06-23T07:38:13Z

+ * Not thread safe.
+ * <p>
+ */
+public class IntervalTree<T> extends AbstractMap<Interval, T> implements NavigableMap<Interval, T>


Nit: Rename since this is essentially a navigable map inspired by Interval Trees.

Suggested change

- public class IntervalTree<T> extends AbstractMap<Interval, T> implements NavigableMap<Interval, T>
+ public class IntervalTreeMap<T> extends AbstractMap<Interval, T> implements NavigableMap<Interval, T>

kfaraz · 2026-06-23T07:41:00Z

+  }
+
+  @VisibleForTesting
+  public String print()


Rename this to toTreeString (or use toString itself), add a javadoc, remove the @VisibleForTesting annotation.

kfaraz · 2026-06-23T09:28:37Z

+import java.util.function.Consumer;
+import java.util.stream.Collectors;
+
+public class IntervalTreeTest


Please use JUnit5 for the test.

kfaraz · 2026-06-23T09:29:28Z

+      Interval interval = pair.lhs;
+      tree.findEncompassing(interval);
+    }
+    System.out.println("Tree find time " + (System.currentTimeMillis() - start));


Please remove System.out statements from the test.

kfaraz · 2026-06-24T13:44:23Z

+
+import javax.annotation.Nullable;
+
+public class TimelineConfig


Instead of a new class, we should probably add the new config fastIntervalSearch in class SegmentMetadataQueryConfig which is bound as druid.query.segmentMetadata.

I considered this earlier but the interval tree is not only used during query processing but also in other cases when loading segments from disk.

Yeah, I suppose that's fair as we might want to use this on non-queryable nodes as well.
Please add a short javadoc though, and maybe rename to SegmentTimelineConfig.

kfaraz · 2026-06-24T13:55:31Z

+      if (fastIntervalSearch) {
+        Map<Interval, TreeMap<VersionType, TimelineEntry>> possibleMatches = allTimeIntervals.findEncompassing(interval);
+        for (Entry<Interval, TreeMap<VersionType, TimelineEntry>> entry : possibleMatches.entrySet()) {
+          Interval eninterval = entry.getKey();


Suggested change

- Interval eninterval = entry.getKey();
+ Interval candidateInterval = entry.getKey();

pirvtech · 2026-07-01T18:48:39Z

> Thanks for updating the patch and for your patience, @pirvtech !
>
> Overall, the changes look good. I have left some minor suggestions. A couple of files remain for review, will finish those soon as well.
>
> Please also update the PR description to include a Release note section which describes the new config.

@kfaraz will look into your comments and address them.

pirvtech and others added 20 commits March 11, 2026 11:04

Speed up searching of partitions within the in memory data source state

9cefd2e

Narrowing search space for an interval when an exact match is not found

143f095

Implemented an optimized data structure to find intervals encompassing a given interval. Using it for finding segments loaded from cache.

d38750e

Added rebalancing

ecb9f82

Added imbalance threshold to control when to trigger rebalancing

6789646

Added rebalance and data content checks

b55c9e2

Updated doc

54b0a4e

Cleaned up some comments and names

982b238

Overwriting value if there is an exact interval match with add

99d6628

Addressing review comments

ac4b8e0

Added feature flag to control use of interval tree for matching segments

42cafff

Using a single interval field for storing the min to max range for a node

de9138b

Generified the match function, so it can be used with different types of matches

ee4dcab

Addressed review comments

236c0e0

Removed commented code

e63e660

Added addition documentation

0badf8c

Updated doc

e076a74

Removed commented code

af17043

Cast IntervalTree as a NavigableMap so it can become a drop in replacement for...

ef483c4

Using both start and end dates of the interval during comparision when finding lower and higher entries

b3c6791

github-advanced-security AI found potential problems Mar 11, 2026

kfaraz reviewed Mar 12, 2026

jtuglu1 mentioned this pull request Apr 8, 2026

perf: Speed up VersionedIntervalTimeline operations #19278

Draft

10 tasks

Made the index configurable via a timeline configuration parameter

01c477f

jtuglu1 self-requested a review April 23, 2026 16:08

FrankChen021 reviewed Apr 25, 2026

Comment thread processing/src/main/java/org/apache/druid/timeline/VersionedIntervalTimeline.java

Comment thread processing/src/main/java/org/apache/druid/timeline/IntervalTree.java Outdated

pirvtech added 2 commits May 6, 2026 12:45

Parameterized VersionedIntervalTimeline tests to run against fast interval search as well

1d3ddc5

Using comparators for exact match checks to take Chronology in account

2113d18

FrankChen021 reviewed May 7, 2026

Comment thread processing/src/main/java/org/apache/druid/timeline/IntervalTree.java Outdated

pirvtech added 2 commits May 7, 2026 16:37

Added ability to specify a separate condition for the range check

5c83921

Added methods for finding matches with full traversal and documented the search methods

053423f

FrankChen021 reviewed May 9, 2026

Merge remote-tracking branch 'apache/master' into segment-interval-tree

2a6ae5e

Passing locale to fix forbidden api validation error

446962b

pirvtech changed the title ~~Interval tree for managing segment metadata in memory~~ perf: Interval tree for managing segment metadata in memory May 11, 2026

FrankChen021 reviewed May 12, 2026

kfaraz reviewed Jun 24, 2026

jtuglu1 added this to the 38.0.0 milestone Jun 24, 2026

pirvtech added 4 commits July 1, 2026 15:16

Removed comments and unused injection

51a1728

Merge remote-tracking branch 'apache/master' into segment-interval-tree

b07a191

Optimizations

1f9d4c4

Documentation and optimizations

6edc125
