[GH-2809] Support distance joins for raster predicates #2980
Draft
jiayuasu wants to merge 1 commit into
Add `RS_DWithin(raster|geom, raster|geom, distance)` so distance joins can use raster operands, and route the join planner through the existing spatial-index machinery.

- `RS_DWithin` expression in `RasterPredicates.scala`, backed by new `RasterPredicates.rsDWithin` overloads (raster-geom, raster-raster) that reuse `convertCRSIfNeeded` and JTS `isWithinDistance`.
- `JoinQueryDetector` and `OptimizableJoinCondition` recognise `RS_DWithin` as a distance-join predicate; the relationship label collapses to `RS_DWithin` for all raster + distance cases.
- `BroadcastIndexJoinExec.createStreamShapes` and the new `TraitJoinQueryBase.toExpandedWGS84EnvelopeRDD` handle the raster stream and build sides for broadcast-index joins; `SpatialIndexExec` and `DistanceJoinExec` route to the same helper so non-broadcast distance joins work too.
- Drop the placeholder `UnsupportedOperationException` guards for distance + raster combinations; geography + raster + distance remains guarded since the geography refiner does not handle raster shapes.

Tests

- `BroadcastIndexJoinSuite`: `RS_DWithin` covers stream-raster / broadcast-raster / swapped-operand forms.
- `RasterJoinSuite`: new `RS_DWithin distance join` describe block covers `DistanceJoinExec` with both partition-side configs, swapped operands, and raster-raster.

Docs

- New `docs/api/sql/Raster-Predicates/RS_DWithin.md` page.
- `Raster-Functions.md` predicate table row.
- `Optimizer.md` raster-distance-join subsection.
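The routing described in the commit message can be pictured with a small stand-alone sketch. This is not Sedona code: the function `choose_join_exec` and its boolean parameters are hypothetical names invented for illustration, and Python's `NotImplementedError` stands in for the `UnsupportedOperationException` guard. It only restates the three outcomes listed above.

```python
def choose_join_exec(broadcast: bool, has_raster: bool, is_geography: bool) -> str:
    """Return the name of the physical operator a distance join would use.

    Hypothetical model of the routing in this PR, not the planner itself.
    """
    # Geography + raster + distance is still guarded, because the geography
    # refiner does not handle raster shapes.
    if is_geography and has_raster:
        raise NotImplementedError(
            "distance join with geography and raster operands is not supported"
        )
    # Broadcast plans use the broadcast-index operator; every other distance
    # join goes through the partitioned distance-join operator.
    return "BroadcastIndexJoinExec" if broadcast else "DistanceJoinExec"
```

In this model a raster operand no longer changes which operator is chosen; it only matters for the one combination that stays guarded.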
Did you read the Contributor Guide?

Is this PR related to a ticket?

- `[GH-XXX] my subject`. Closes #2809 (Support distance joins for raster predicates).

What changes were proposed in this PR?

Adds an `RS_DWithin(left, right, distance)` predicate so distance joins can use raster operands, and routes the join planner through the same spatial-index machinery used for `RS_Intersects` / `ST_DWithin`.

- `RS_DWithin` SQL function with three overloads (`raster + geom`, `geom + raster`, `raster + raster`), backed by `RasterPredicates.rsDWithin` (CRS conversion via the existing `convertCRSIfNeeded`, JTS `isWithinDistance` for the per-row check).
- `JoinQueryDetector` and `OptimizableJoinCondition` treat `RS_DWithin` as a distance-join predicate. Broadcast plans go through `BroadcastIndexJoinExec`; non-broadcast plans go through `DistanceJoinExec`.
- `BroadcastIndexJoinExec.createStreamShapes`, `SpatialIndexExec`, and `DistanceJoinExec` now project raster shapes to WGS84 envelopes (the same path `RS_Intersects` already uses) and expand by the radius before the R-tree filter. The new helper lives in `TraitJoinQueryBase.toExpandedWGS84EnvelopeRDD`.
- Drops the placeholder `UnsupportedOperationException` guards for distance + raster combinations. Geography + raster + distance remains guarded since the geography refiner doesn't accept raster shapes yet.

How was this patch tested?

- `BroadcastIndexJoinSuite`: new `Passed RS_DWithin` test exercises stream-raster, broadcast-raster, and swapped-operand forms.
- `RasterJoinSuite`: new `RS_DWithin distance join` describe block covers `DistanceJoinExec` with both partition-side configs, swapped operands, and raster-raster.
- `-Dspark=3.4 -Pscala2.12`.

Did this PR include necessary documentation updates?

- `v1.9.1` in the `Since` field.
- `docs/api/sql/Raster-Predicates/RS_DWithin.md` (intro, CRS rules, all three signatures, SQL example, join-planning note).
- `Raster-Functions.md`: predicate table row for `RS_DWithin`.
- `Optimizer.md`: new "Raster distance join" subsection with broadcast and non-broadcast SQL examples.
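The "expand by the radius before the R-tree filter" step in the change list is a filter-and-refine pattern, and a minimal sketch may make it easier to review. Everything here is an assumption for illustration: the `Envelope` class and `distance_join` function are hypothetical, a nested loop stands in for the R-tree, plain rectangles stand in for raster footprints and geometries, and coordinates are treated as planar with no CRS handling. It is not the code in `TraitJoinQueryBase.toExpandedWGS84EnvelopeRDD`.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Envelope:
    """Axis-aligned bounding box; a point is an envelope with zero extent."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def expand_by(self, d: float) -> "Envelope":
        # Grow the box by d on every side so that anything within distance d
        # of the original box is guaranteed to intersect the expanded one.
        return Envelope(self.min_x - d, self.min_y - d, self.max_x + d, self.max_y + d)

    def intersects(self, other: "Envelope") -> bool:
        return (self.min_x <= other.max_x and other.min_x <= self.max_x
                and self.min_y <= other.max_y and other.min_y <= self.max_y)

    def distance(self, other: "Envelope") -> float:
        # Gap along each axis, or 0 when the boxes overlap on that axis.
        dx = max(other.min_x - self.max_x, self.min_x - other.max_x, 0.0)
        dy = max(other.min_y - self.max_y, self.min_y - other.max_y, 0.0)
        return hypot(dx, dy)


def distance_join(left, right, d):
    """Return index pairs (i, j) whose shapes lie within distance d."""
    pairs = []
    for i, l in enumerate(left):
        expanded = l.expand_by(d)  # filter step uses the expanded envelope
        for j, r in enumerate(right):
            # Cheap filter first, exact distance check (refine) second.
            if expanded.intersects(r) and l.distance(r) <= d:
                pairs.append((i, j))
    return pairs
```

With a footprint `Envelope(0, 0, 10, 10)` and radius 3, a point at (13, 13) passes the expanded-envelope filter but sits about 4.24 units from the footprint corner, so the refine step rejects it. That is why the per-row `isWithinDistance` check is still needed after the index lookup.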