Problem
Pagination.applyFilterAndPage uses a surplusFactor that grows linearly (++surplusFactor) when a database page yields zero filter matches. This means the page size grows as limit * 1, limit * 2, limit * 3, etc. When the filter match rate is very low (e.g., 1-5%), this linear ramp-up requires many round trips to the database before the fetch window is large enough to capture any matches. Each empty round trip is wasted I/O.
The inverse problem also exists: when a page does contain matches, the factor decreases by exactly 1 (--surplusFactor), ignoring how many matches were actually found. A page that matched 1 out of 100 records and a page that matched 99 out of 100 receive the same adjustment.
Concrete scenario
- 10,000 records in the database
- Filter match rate: 2% (200 records pass)
- Requested page size (
limit): 20
To fill one page of 20 matching records, the algorithm needs ~1,000 database records. With surplusFactor incrementing by 1 each empty page and page size = limit * surplusFactor:
- Iteration 1: fetches 20 records, ~0 matches, factor becomes 2
- Iteration 2: fetches 40 records, ~1 match, factor stays 2
- Iteration 3: fetches 40 records, ~1 match, factor stays 2
- ...many more iterations needed
This produces excessive database round trips that could be avoided.
Where it matters
This affects every code path that composes a client-side filter with pagination:
- Audience framework:
$checkIfVisible() is composed via .and() into the filter predicate for every find/load call, making every paginated Audience query subject to this behavior.
Runway.filter() / Runway.filterAny(): used when Criteria cannot be resolved server-side.
DatabaseInterface default methods: any find/load overload accepting both a Predicate filter and a Page.
Proposed Fix
Replace the linear surplusFactor increment with a match-rate-based calculation that estimates the page size needed to fill the remaining slots in one fetch.
Algorithm
After processing each database page, compute:
matchRate = matchesThisPage / unfilteredPageSize
remaining = limit - count
nextPageSize = ceil(remaining / matchRate)
This jumps directly to the right ballpark instead of crawling toward it.
Edge cases
| Condition |
Handling |
| Zero matches on a page (matchRate = 0) |
Cannot compute a rate. Double the current page size as a geometric fallback. This is still significantly faster than +1. |
| Overshoot (calculated size is very large) |
Cap at a configurable maximum (e.g., limit * MAX_SURPLUS_FACTOR where MAX_SURPLUS_FACTOR could be 50 or similar) to prevent pathological memory/network usage. |
| Very small unfiltered page (e.g., 1 record) |
The rate estimate from a tiny sample is unreliable. The cap handles the worst case; subsequent pages will refine the estimate. |
Pseudocode
Set<T> records = new LinkedHashSet<>();
int offset = page.offset();
int limit = page.limit();
int count = 0;
int skipped = 0;
int nextSize = limit;
page = Page.of(0, nextSize);
outer: while (count < limit) {
Set<T> unfiltered = function.apply(page);
if(unfiltered.isEmpty()) {
break;
}
int matched = 0;
for (T record : unfiltered) {
if(filter.test(record)) {
if(skipped < offset) {
++skipped;
}
else {
records.add(record);
++count;
++matched;
if(count == limit) {
break outer;
}
}
}
}
int remaining = limit - count;
if(matched == 0) {
// No signal -- geometric growth as fallback
nextSize = Math.min(nextSize * 2, limit * MAX_SURPLUS_FACTOR);
}
else {
// Estimate based on observed match rate
double matchRate = (double) matched / unfiltered.size();
nextSize = (int) Math.ceil(remaining / matchRate);
nextSize = Math.min(nextSize, limit * MAX_SURPLUS_FACTOR);
nextSize = Math.max(nextSize, limit);
}
page = page.next().size(nextSize);
}
return records;
Semantic change
surplusFactor (a multiplier on limit) is replaced by nextSize (the actual page size to request). This is cleaner since the factor was only ever used in page.size(limit * surplusFactor).
Unit Test Plan
The existing PaginationTest.testApplyFilterAndPage covers the basic correctness case (50% match rate, iterating through all pages). The following additional test cases should be added:
1. testApplyFilterAndPageWithLowMatchRate
- Setup: 1,000 items, filter passes ~2% (e.g.,
item % 50 == 0).
- Verify: Page of 10 returns exactly the correct 10 items for each page. All pages are filled correctly across the full dataset.
- Purpose: Exercises the match-rate estimation and geometric fallback.
2. testApplyFilterAndPageWithNoMatches
- Setup: 100 items, filter rejects everything.
- Verify: Returns an empty set (does not loop infinitely).
- Purpose: Ensures the
unfiltered.isEmpty() termination condition works correctly when the filter eliminates all records.
3. testApplyFilterAndPageWithAllMatches
- Setup: 100 items, filter passes everything.
- Verify: Pagination returns the same results as un-filtered pagination.
- Purpose: Confirms no off-by-one errors when the filter is a no-op.
4. testApplyFilterAndPageMatchRateDropsMidStream
- Setup: 200 items where the first 100 have a 50% match rate and the last 100 have a 2% match rate (e.g., items 1-100 match if even, items 101-200 match if divisible by 50).
- Verify: Pages that span the boundary are still correctly filled.
- Purpose: Validates that the rate-based estimate adapts when the match rate changes across pages.
5. testApplyFilterAndPageWithOffsetBeyondMatches
- Setup: 50 items, filter passes 10. Page with offset > 10.
- Verify: Returns an empty set.
- Purpose: Ensures offset skipping terminates correctly when there are not enough matches.
6. testApplyFilterAndPageSingleItemPages
- Setup: 20 items,
limit = 1, filter passes ~50%.
- Verify: Each single-item page returns the correct next matching item.
- Purpose: Edge case where
limit is 1 and the algorithm must advance correctly one match at a time.
Problem
Pagination.applyFilterAndPageuses asurplusFactorthat grows linearly (++surplusFactor) when a database page yields zero filter matches. This means the page size grows aslimit * 1,limit * 2,limit * 3, etc. When the filter match rate is very low (e.g., 1-5%), this linear ramp-up requires many round trips to the database before the fetch window is large enough to capture any matches. Each empty round trip is wasted I/O.The inverse problem also exists: when a page does contain matches, the factor decreases by exactly 1 (
--surplusFactor), ignoring how many matches were actually found. A page that matched 1 out of 100 records and a page that matched 99 out of 100 receive the same adjustment.Concrete scenario
limit): 20To fill one page of 20 matching records, the algorithm needs ~1,000 database records. With
surplusFactorincrementing by 1 each empty page and page size =limit * surplusFactor:This produces excessive database round trips that could be avoided.
Where it matters
This affects every code path that composes a client-side filter with pagination:
$checkIfVisible()is composed via.and()into the filter predicate for everyfind/loadcall, making every paginated Audience query subject to this behavior.Runway.filter()/Runway.filterAny(): used whenCriteriacannot be resolved server-side.DatabaseInterfacedefault methods: anyfind/loadoverload accepting both aPredicatefilter and aPage.Proposed Fix
Replace the linear
surplusFactorincrement with a match-rate-based calculation that estimates the page size needed to fill the remaining slots in one fetch.Algorithm
After processing each database page, compute:
This jumps directly to the right ballpark instead of crawling toward it.
Edge cases
+1.limit * MAX_SURPLUS_FACTORwhereMAX_SURPLUS_FACTORcould be 50 or similar) to prevent pathological memory/network usage.Pseudocode
Semantic change
surplusFactor(a multiplier onlimit) is replaced bynextSize(the actual page size to request). This is cleaner since the factor was only ever used inpage.size(limit * surplusFactor).Unit Test Plan
The existing
PaginationTest.testApplyFilterAndPagecovers the basic correctness case (50% match rate, iterating through all pages). The following additional test cases should be added:1.
testApplyFilterAndPageWithLowMatchRateitem % 50 == 0).2.
testApplyFilterAndPageWithNoMatchesunfiltered.isEmpty()termination condition works correctly when the filter eliminates all records.3.
testApplyFilterAndPageWithAllMatches4.
testApplyFilterAndPageMatchRateDropsMidStream5.
testApplyFilterAndPageWithOffsetBeyondMatches6.
testApplyFilterAndPageSingleItemPageslimit = 1, filter passes ~50%.limitis 1 and the algorithm must advance correctly one match at a time.