fix: Apply timezone offset to convert local time boundaries to UTC#37014
fix: Apply timezone offset to convert local time boundaries to UTC#37014jayakarkatika wants to merge 5 commits into
Conversation
|
CodeAnt AI is reviewing your PR. Thanks for using CodeAnt! 🎉We're free for open-source projects. if you're enjoying it, help us grow by sharing. Share on X · |
Code Review Agent Run #4da53fActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Nitpicks 🔍
|
| # Example: User selects "Last day" which gives midnight local time | ||
| # If offset=6 (UTC+6), we subtract 6 hours to get UTC time | ||
| offset_hours = getattr(self, "offset", 0) or 0 | ||
| if offset_hours and start_dttm: |
There was a problem hiding this comment.
Suggestion: The value returned by self.offset may not be an integer (it could be a string or None) and passing a non-numeric value to timedelta(hours=...) will raise a TypeError; coerce the offset to an int and fallback to 0 on failure. [type error]
Severity Level: Minor
| if offset_hours and start_dttm: | |
| try: | |
| offset_hours = int(offset_hours) | |
| except (TypeError, ValueError): | |
| # If offset cannot be converted to int, treat it as no offset | |
| offset_hours = 0 |
Why it matters? ⭐
The current code assumes self.offset is numeric. If a subclass or consumer accidentally provides a string (e.g. "6") or other non-numeric value, timedelta(hours=offset_hours) will raise TypeError. Coercing to int with a safe fallback prevents an unexpected runtime crash. The change is local and defensive.
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2396:2396
**Comment:**
*Type Error: The value returned by `self.offset` may not be an integer (it could be a string or None) and passing a non-numeric value to `timedelta(hours=...)` will raise a TypeError; coerce the offset to an int and fallback to 0 on failure.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| # If offset=6 (UTC+6), we subtract 6 hours to get UTC time | ||
| offset_hours = getattr(self, "offset", 0) or 0 | ||
| if offset_hours and start_dttm: | ||
| start_dttm = start_dttm - timedelta(hours=offset_hours) |
There was a problem hiding this comment.
Suggestion: Subtracting a timedelta from a string or applying a numeric offset to a timezone-aware datetime will raise or double-adjust the value; parse string inputs to datetime and, if the datetime already has tzinfo, convert it to UTC instead of applying the numeric offset. [logic error]
Severity Level: Minor
| start_dttm = start_dttm - timedelta(hours=offset_hours) | |
| # Accept string inputs as datetimes | |
| if isinstance(start_dttm, str): | |
| start_dttm = dateutil.parser.parse(start_dttm) | |
| # If datetime is timezone-aware, convert directly to UTC (do not apply offset again) | |
| if getattr(start_dttm, "tzinfo", None): | |
| start_dttm = start_dttm.astimezone(pytz.utc) | |
| else: |
Why it matters? ⭐
Subtracting a timedelta from a string will raise, and subtracting hours from a tz-aware datetime can double-adjust the timestamp or be incorrect. Parsing string inputs and converting tz-aware datetimes properly to UTC (via astimezone) addresses both problems. The project already imports dateutil.parser and pytz, so the suggested code is implementable and fixes real edge-cases.
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2397:2397
**Comment:**
*Logic Error: Subtracting a timedelta from a string or applying a numeric offset to a timezone-aware datetime will raise or double-adjust the value; parse string inputs to datetime and, if the datetime already has tzinfo, convert it to UTC instead of applying the numeric offset.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| if offset_hours and start_dttm: | ||
| start_dttm = start_dttm - timedelta(hours=offset_hours) | ||
| if offset_hours and end_dttm: | ||
| end_dttm = end_dttm - timedelta(hours=offset_hours) |
There was a problem hiding this comment.
Suggestion: The same issues apply to end_dttm: subtracting a timedelta from a string or from a timezone-aware datetime can raise errors or double-adjust the timezone; parse strings and convert tz-aware datetimes to UTC instead of applying the numeric offset. [logic error]
Severity Level: Minor
| end_dttm = end_dttm - timedelta(hours=offset_hours) | |
| # Accept string inputs as datetimes | |
| if isinstance(end_dttm, str): | |
| end_dttm = dateutil.parser.parse(end_dttm) | |
| # If datetime is timezone-aware, convert directly to UTC (do not apply offset again) | |
| if getattr(end_dttm, "tzinfo", None): | |
| end_dttm = end_dttm.astimezone(pytz.utc) | |
| else: |
Why it matters? ⭐
Same rationale as for start_dttm: string inputs would raise on subtraction and tz-aware datetimes should be converted to UTC instead of blindly subtracting hours. Handling end_dttm the same way avoids asymmetric behavior between start and end.
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2399:2399
**Comment:**
*Logic Error: The same issues apply to `end_dttm`: subtracting a timedelta from a string or from a timezone-aware datetime can raise errors or double-adjust the timezone; parse strings and convert tz-aware datetimes to UTC instead of applying the numeric offset.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.|
CodeAnt AI finished reviewing your PR. |
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #37014 +/- ##
==========================================
- Coverage 64.28% 64.28% -0.01%
==========================================
Files 2659 2659
Lines 144304 144334 +30
Branches 33260 33266 +6
==========================================
+ Hits 92773 92783 +10
- Misses 49902 49919 +17
- Partials 1629 1632 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
🎪 Showtime deployed environment on GHA for 5e3504a • Environment: http://52.88.55.243:8080 (admin/admin) |
|
Can we add and/or update tests, if this was broken previously, but works now? |
- This keeps the backwards compatibility
|
I realize that the original commit might break charts which depend on current behavior. |
| or None if not configured. | ||
|
|
||
| """ | ||
| return self.extra_dict.get("timezone") |
There was a problem hiding this comment.
Suggestion: The new get_dataset_timezone uses self.extra_dict which doesn't exist on the mixin or most datasource classes and will raise AttributeError at runtime; use available accessors (get_extra_dict or the extra property) to read the JSON extras safely. [null pointer]
Severity Level: Critical 🚨
- ❌ Queries error when dataset timezone reading fails.
- ❌ Dashboard time-based queries return HTTP 500.
- ⚠️ Timezone conversion feature becomes non-functional.| return self.extra_dict.get("timezone") | |
| # Prefer dataset helper if available, otherwise fall back to `extra` property. | |
| if hasattr(self, "get_extra_dict") and callable(getattr(self, "get_extra_dict")): | |
| extras = self.get_extra_dict() or {} | |
| else: | |
| extras = getattr(self, "extra", {}) or {} | |
| return extras.get("timezone") |
Steps of Reproduction ✅
1. Trigger a query that executes ExploreMixin.normalize_df in superset/models/helpers.py
(hunk around lines 1247-1282). The normalize_df implementation calls dataset_timezone =
self.get_dataset_timezone() at superset/models/helpers.py:1270.
2. Execution enters get_dataset_timezone at superset/models/helpers.py:895-903 which
returns self.extra_dict.get("timezone"). The attribute `extra_dict` does not exist on
ExploreMixin or ExtraJSONMixin in this file.
3. Python raises AttributeError("'ExploreMixin' object has no attribute 'extra_dict'") at
superset/models/helpers.py:903, propagating back to the query path (e.g., get_query_result
-> normalize_df) and causing the request to fail.
4. The same broken accessor is also invoked from get_time_filter
(superset/models/helpers.py:2406) when building time filters; any call path that resolves
to get_dataset_timezone will reproduce the AttributeError. This is not a stylistic issue —
it's a real missing attribute that will raise at runtime.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 903:903
**Comment:**
*Null Pointer: The new `get_dataset_timezone` uses `self.extra_dict` which doesn't exist on the mixin or most datasource classes and will raise AttributeError at runtime; use available accessors (`get_extra_dict` or the `extra` property) to read the JSON extras safely.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.| if _col.offset: | ||
| df[_col.col_label] += timedelta(hours=_col.offset) | ||
| elif _col.offset: | ||
| df[_col.col_label] += timedelta(hours=_col.offset) |
There was a problem hiding this comment.
Suggestion: Offset handling is incorrect: the code adds the configured offset hours to the datetime (using += timedelta(hours=_col.offset)) which moves times in the wrong direction when converting local boundaries to UTC; additionally, the code treats offset 0 as falsy so a valid zero offset is ignored. Change the logic to subtract the offset (and check is not None) so offsets of 0 are handled. [logic error]
Severity Level: Critical 🚨
- ❌ Dashboard time filters produce incorrect UTC boundaries.
- ❌ Queries to /data return wrong time ranges.
- ⚠️ Charts and aggregations misaligned by timezone offset.
- ⚠️ Dataset timezone/offset settings produce incorrect results.| if _col.offset: | |
| df[_col.col_label] += timedelta(hours=_col.offset) | |
| elif _col.offset: | |
| df[_col.col_label] += timedelta(hours=_col.offset) | |
| if _col.offset is not None: | |
| # Subtract offset to convert local time to UTC (offset is hours ahead) | |
| df[_col.col_label] -= timedelta(hours=_col.offset) | |
| elif _col.offset is not None: | |
| # Subtract configured offset (support zero offset) | |
| df[_col.col_label] -= timedelta(hours=_col.offset) |
Steps of Reproduction ✅
1. Open a Python REPL or test and import normalize_dttm_col from superset/utils/core.py
(function begins at file line ~1934 where `def normalize_dttm_col(` is defined).
2. Construct a DataFrame with a datetime column named "__timestamp", e.g. in code at
runtime by:
- df = pd.DataFrame({"__timestamp": pd.to_datetime(["2026-01-09 00:00:00"])})
3. Create a DateColumn instance with offset set (example using class defined around lines
~1847 in superset/utils/core.py):
- from superset.utils.core import DateColumn
- col = DateColumn(col_label="__timestamp", offset=6)
4. Call normalize_dttm_col(df, (col,)) (function at superset/utils/core.py: ~1934).
Observe resulting df["__timestamp"] has been increased by +6 hours (current code does +=
timedelta), whereas the PR description and intended behavior expect the local midnight
(UTC+6) to be converted to UTC by subtracting 6 hours. Also observe that setting offset=0
(DateColumn(offset=0)) will be ignored by current code because `if _col.offset:` treats 0
as falsy, so no adjustment is applied.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/utils/core.py
**Line:** 1974:1977
**Comment:**
*Logic Error: Offset handling is incorrect: the code adds the configured `offset` hours to the datetime (using `+= timedelta(hours=_col.offset)`) which moves times in the wrong direction when converting local boundaries to UTC; additionally, the code treats offset `0` as falsy so a valid zero offset is ignored. Change the logic to subtract the offset (and check `is not None`) so offsets of `0` are handled.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Code Review Agent Run #63ebafActionable Suggestions - 0Additional Suggestions - 1
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Code Review Agent Run #4390eeActionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
|
Would love to see this one merged, but there are a ton of bot review comments that are still unaddressed, and I'd still like to see added coverage for get_time_filter with a configured timezone (boundaries converted to UTC) and the invalid-tz fallback? |
Adds unit tests for normalize_dttm_col's new timezone path (UTC→dataset tz, DST handling, precedence over offset, invalid-tz fallback). Also reads extra_dict defensively in get_dataset_timezone — it's provided by concrete datasources (SqlaTable), not ExploreMixin, so the bare self.extra_dict tripped mypy and would AttributeError on other ExploreMixin subclasses. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Thanks @jayakarkatika! Timezone-aware beats a fixed offset here since it handles DST, so I'm into the direction. I pushed two things to your branch: unit tests for the |
fix(dashboard): Apply timezone offset to convert local time boundaries to UTC
SUMMARY
If the database has times stored in UTC, with no timezone information in the field itself, the last day / last week, etc do not work as expected.
The fix is to apply timezone offset using extra timezone attribute to convert local time boundaries to UTC
TESTING INSTRUCTIONS
Add {"timezone": "Europe/Berlin"} in extra field in dataset -> edit -> settings
Go to dashboard and select time range: last day
check the network tab.
If today's date is 2026-01-10
/data should show query as:
WHERE last_modified >= STR_TO_DATE('2026-01-08 23:00:00.000000', '%Y-%m-%d %H:%i:%s.%f') AND last_modified < STR_TO_DATE('2026-01-09 23:00:00.000000', '%Y-%m-%d %H:%i:%s.%f') GROUP BY