feat: implement retract_batch for array_agg sliding window support#22015
feat: implement retract_batch for array_agg sliding window support#22015SubhamSinghal wants to merge 1 commit intoapache:mainfrom
Conversation
There was a problem hiding this comment.
@SubhamSinghal
Thanks for the work here. I found one issue that looks like it can affect sliding window correctness for array_agg(... IGNORE NULLS)
|
|
||
| let val = &values[0]; | ||
| let mut to_retract = if self.ignore_nulls { | ||
| val.len() - val.null_count() |
There was a problem hiding this comment.
I think retract_batch should use the same null definition as update_batch here.
update_batch filters IGNORE NULLS using val.logical_nulls(), but this path counts non-null rows with val.len() - val.null_count(). For arrays where logical nullability differs from physical nullability, such as dictionary or run arrays whose values contain logical nulls, the accumulator can store fewer rows than retract_batch later removes.
That means a sliding array_agg(... IGNORE NULLS) window can over-retract and accidentally drop following values from the frame. Could we switch this to the same null semantics, for example val.len() - val.logical_null_count(), and add a regression test with an input type where logical nulls differ from null_count()?
Which issue does this PR close?
Closes #21957.
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Yes. Using UT
Are there any user-facing changes?
Yes.
array_aggnow works with bounded/sliding window frames. Queries that previously errored now succeed:No breaking API changes