Derive seasonal lags from sensor resolution#2157

Open
BelhsanHmida wants to merge 18 commits into main from feat/forecast-resolution-aware-lags

Conversation

@BelhsanHmida
Contributor

@BelhsanHmida BelhsanHmida commented May 9, 2026

Description

  • Make the LightGBM forecaster's seasonal lag configurable.
  • Derive daily seasonal lag steps from the target sensor resolution.
  • Improve daily-seasonality handling for sub-hourly sensors, such as PT15M sensors.
  • Add a changelog entry for this PR.

Look & Feel

Forecast improvement for the 3-day case:

  • Before: visualization (8)
  • After: visualization (9)

How to test

  • uv run pytest flexmeasures/data/tests/test_forecasting_pipeline.py -q
  • pre-commit run --all-files

Further Improvements

  • Add validation or warnings when the available training history is too short for the requested forecast horizon.

Related Items

Sign-off

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or other license that is incompatible with FlexMeasures

Context:
- Forecasting daily seasonality was hard-coded to 24 lag steps.
- That only represents one day for hourly sensors.

Change:
- Add a seasonal_lag_steps parameter to CustomLGBM.
- Keep the previous 24-step default for compatibility.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Context:
- Daily lag steps depend on the target sensor resolution.
- PT15M sensors need 96 steps to represent one day, not 24.

Change:
- Compute one-day lag steps from the target sensor event resolution.
- Pass the derived value into CustomLGBM during training.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
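
As a rough illustration of the arithmetic behind this commit (not the PR's actual code), the number of steps spanning one day is the length of a day divided by the sensor's event resolution. The function name `daily_lag_steps` is hypothetical.

```python
from datetime import timedelta


def daily_lag_steps(event_resolution: timedelta) -> int:
    """Number of time steps that span one day (hypothetical helper, not the PR's code)."""
    # timedelta // timedelta performs floor division and returns an int.
    return timedelta(days=1) // event_resolution


print(daily_lag_steps(timedelta(hours=1)))     # 24
print(daily_lag_steps(timedelta(minutes=15)))  # 96
```

This matches the figures in the commit message: 24 steps for hourly sensors and 96 steps for PT15M sensors.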
Context:
- Resolution-aware daily lags need enough training samples for the full requested horizon.
- Three days of PT15M data with a 48h horizon used to produce forecasts on main and should not become a hard failure.

Change:
- Fall back to the legacy 24-step lag pattern when the training window cannot support daily lags for the farthest horizon.
- Add regression coverage for under-sampled and sufficiently sampled PT15M histories.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
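
The fallback described in this commit could be sketched as follows. The exact eligibility rule used in the PR is not shown in this conversation, so the threshold below (history strictly longer than the lag plus the farthest horizon) and the name `choose_seasonal_lag_steps` are assumptions made only for illustration.

```python
LEGACY_LAG_STEPS = 24  # the previous hard-coded default mentioned in the PR


def choose_seasonal_lag_steps(
    daily_lag_steps: int, n_training_samples: int, max_forecast_horizon: int
) -> int:
    """Pick the daily lag if the history can support it, else the legacy lag.

    Assumption (not taken from the PR): a lag is usable only if the training
    window is strictly longer than the lag plus the farthest horizon.
    """
    required_samples = daily_lag_steps + max_forecast_horizon
    if n_training_samples > required_samples:
        return daily_lag_steps
    return LEGACY_LAG_STEPS


# PT15M sensor: one day = 96 steps, a 48h horizon = 192 steps.
print(choose_seasonal_lag_steps(96, n_training_samples=3 * 96, max_forecast_horizon=192))  # 24
print(choose_seasonal_lag_steps(96, n_training_samples=7 * 96, max_forecast_horizon=192))  # 96
```

Under this assumed rule, three days of PT15M history falls back to the legacy lag instead of failing, while a week of history uses the daily lag.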
@BelhsanHmida BelhsanHmida self-assigned this May 9, 2026
@BelhsanHmida BelhsanHmida marked this pull request as ready for review May 15, 2026 11:13
Signed-off-by: Mohamed Belhsan Hmida <149331360+BelhsanHmida@users.noreply.github.com>
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the LightGBM forecasting model so daily seasonal lag features can scale with the target sensor resolution, improving support for sub-hourly forecasting.

Changes:

  • Adds configurable seasonal lag parameters to CustomLGBM.
  • Derives daily lag steps from target_sensor.event_resolution during training.
  • Adds a unit test for fallback behavior when history is too short for the requested daily lag.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File: Description
flexmeasures/data/models/forecasting/custom_models/lgbm_model.py: Adds seasonal lag configuration and fallback logic.
flexmeasures/data/models/forecasting/pipelines/train.py: Passes derived daily lag steps and training sample count into CustomLGBM.
flexmeasures/data/tests/test_forecasting_pipeline.py: Tests fallback vs daily seasonal lag selection.

Comment thread flexmeasures/data/models/forecasting/pipelines/train.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Context:
- Copilot noted externally configurable lag steps could be zero or negative.
- Invalid values would break modulo-based lag setup.

Change:
- Reject seasonal and fallback lag step values below 1.
- Fix the nearby seasonal-lag comment typo.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
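
A minimal sketch of the kind of check this commit describes, assuming a hypothetical helper named `validate_lag_steps` (the PR's own validation may be structured differently). It also shows why a zero value is a problem for modulo-based lag setup.

```python
def validate_lag_steps(name: str, value: int) -> int:
    """Reject lag step values below 1 (hypothetical helper, not the PR's code)."""
    if value < 1:
        raise ValueError(f"{name} must be at least 1, got {value}.")
    return value


# Why this matters: a zero lag breaks the modulo used to align lags with horizons.
try:
    3 % 0
except ZeroDivisionError as exc:
    print(f"3 % 0 fails: {exc}")

print(validate_lag_steps("seasonal_lag_steps", 96))  # 96
```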
Context:
- Dividing one day by sensor resolution with int() silently truncated non-divisor resolutions.
- That could train on an offset that was close to, but not exactly, one day.

Change:
- Add a helper that derives daily lag steps only for resolutions that divide one day evenly.
- Fall back to the legacy 24-step lag pattern with a warning otherwise.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
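
The truncation issue and the guarded derivation can be sketched like this. The helper name `derive_daily_lag_steps` and the log message are assumptions; only the behavior (derive for even divisors of one day, otherwise warn and fall back to 24) comes from the commit message.

```python
import logging
from datetime import timedelta

logger = logging.getLogger(__name__)

LEGACY_LAG_STEPS = 24  # legacy default named in the PR description


def derive_daily_lag_steps(event_resolution: timedelta) -> int:
    """Return steps per day if the resolution divides one day evenly, else the legacy value.

    Hypothetical sketch; the helper in the PR may differ in name and details.
    """
    day = timedelta(days=1)
    # timedelta % timedelta returns the remainder as a timedelta.
    if event_resolution <= timedelta(0) or day % event_resolution != timedelta(0):
        logger.warning(
            "Resolution %s does not divide one day evenly; falling back to %d lag steps.",
            event_resolution,
            LEGACY_LAG_STEPS,
        )
        return LEGACY_LAG_STEPS
    return day // event_resolution


print(derive_daily_lag_steps(timedelta(minutes=15)))  # 96
# int(1440 / 7) would give 205 steps, i.e. 1435 minutes rather than a full day.
print(derive_daily_lag_steps(timedelta(minutes=7)))   # 24 (fallback)
```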
Context:
- Review feedback requested validation around configurable lag steps and daily lag derivation.
- The fallback behavior needs explicit coverage to avoid regressions.

Change:
- Cover invalid seasonal and fallback lag step values.
- Cover daily lag derivation for divisible and non-divisible sensor resolutions.

Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
@BelhsanHmida BelhsanHmida requested a review from Flix6x May 19, 2026 06:53
@BelhsanHmida
Contributor Author

Test results on 4 days:

  • Before:
visualization (15)
  • After:
visualization (14)

@Flix6x Flix6x added this to the 0.33.0 milestone May 21, 2026
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/pipelines/train.py
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
@BelhsanHmida BelhsanHmida requested a review from Flix6x May 21, 2026 16:25
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
@BelhsanHmida BelhsanHmida requested a review from Flix6x May 21, 2026 17:33
Contributor

@Flix6x Flix6x left a comment

Great direction. I think the function signature can be simplified somewhat now.

Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Comment thread flexmeasures/data/models/forecasting/custom_models/lgbm_model.py Outdated
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
Signed-off-by: Mohamed Belhsan Hmida <mohamedbelhsanhmida@gmail.com>
@BelhsanHmida BelhsanHmida requested a review from Flix6x May 21, 2026 22:59
Contributor

@Flix6x Flix6x left a comment

I suggest a simplification and an extended docstring with a clear example and visualisation. Could you please check if the tests would still pass and whether the forecasts are still of equal quality?

Comment on lines +151 to +160
{
    -1,
    *(
        darts_lag
        for seasonal_lag_steps in eligible_seasonal_lags_steps
        for darts_lag in self._lags_for_horizon(
            horizon, self.max_forecast_horizon, seasonal_lag_steps
        )
    ),
}

This suggestion is part of a set of simplifications that I believe to be logically equivalent.

Suggested change
- {
-     -1,
-     *(
-         darts_lag
-         for seasonal_lag_steps in eligible_seasonal_lags_steps
-         for darts_lag in self._lags_for_horizon(
-             horizon, self.max_forecast_horizon, seasonal_lag_steps
-         )
-     ),
- }
+ {
+     darts_lag
+     for seasonal_lag_steps in eligible_seasonal_lags_steps
+     for darts_lag in self._lags_for_horizon(
+         horizon, self.max_forecast_horizon, seasonal_lag_steps
+     )
+ }

from flexmeasures.data.models.forecasting.custom_models.base_model import BaseModel


DEFAULT_SEASONAL_LAGS_STEPS = [24]

This suggestion is part of a set of simplifications that I believe to be logically equivalent.

Suggested change
- DEFAULT_SEASONAL_LAGS_STEPS = [24]
+ DEFAULT_SEASONAL_LAGS_STEPS = [1, 24]

:param use_past_covariates: Whether past covariates are used for fitting and prediction.
:param use_future_covariates: Whether future covariates are used for fitting and prediction.
:param ensure_positive: Whether negative predictions should be clipped to zero.
:param seasonal_lags_steps: Candidate seasonal lag steps to keep if enough training samples remain.

This suggestion is part of a set of simplifications that I believe to be logically equivalent.

Suggested change
- :param seasonal_lags_steps: Candidate seasonal lag steps to keep if enough training samples remain.
+ :param seasonal_lags_steps: Candidate seasonal lag steps to keep if enough training samples remain. Include 1 in the list to account for the most recent observation (recommended).
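
To see why listing 1 as a seasonal lag covers the most recent observation, the aligned-lag formula `l = -(s - (h % s))` from the proposed docstring in this review can be evaluated for `s = 1`. This is only a small arithmetic check, not code from the PR.

```python
# Aligned lag from the review's formula l = -(s - (h % s)), evaluated for s = 1.
s = 1
aligned_lags = {h: -(s - (h % s)) for h in range(5)}
print(aligned_lags)  # {0: -1, 1: -1, 2: -1, 3: -1, 4: -1}
```

Since `h % 1` is always 0, the aligned lag is -1 for every horizon, which is the lag the earlier set literal added explicitly.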

Comment on lines +119 to +136
@staticmethod
def _lags_for_horizon(
    horizon: int, max_forecast_horizon: int, seasonal_lag_steps: int
) -> list[int]:
    """Return Darts lags for one seasonal cycle at the given forecast horizon."""
    lag_steps = seasonal_lag_steps - (horizon % seasonal_lag_steps)
    darts_lags = [-lag_steps, -lag_steps - 1]

    if (
        horizon == 0
        or horizon % seasonal_lag_steps == 0
        or horizon == max_forecast_horizon - 1
    ):
        darts_lags = [-seasonal_lag_steps]
    elif horizon % seasonal_lag_steps == seasonal_lag_steps - 1:
        darts_lags = [-2]

    return darts_lags

This suggestion is part of a set of simplifications that I believe to be logically equivalent.

Suggested change
- @staticmethod
- def _lags_for_horizon(
-     horizon: int, max_forecast_horizon: int, seasonal_lag_steps: int
- ) -> list[int]:
-     """Return Darts lags for one seasonal cycle at the given forecast horizon."""
-     lag_steps = seasonal_lag_steps - (horizon % seasonal_lag_steps)
-     darts_lags = [-lag_steps, -lag_steps - 1]
-     if (
-         horizon == 0
-         or horizon % seasonal_lag_steps == 0
-         or horizon == max_forecast_horizon - 1
-     ):
-         darts_lags = [-seasonal_lag_steps]
-     elif horizon % seasonal_lag_steps == seasonal_lag_steps - 1:
-         darts_lags = [-2]
-     return darts_lags
+ @staticmethod
+ def _lags_for_horizon(
+     horizon: int,
+     max_forecast_horizon: int,
+     seasonal_lag_steps: int,
+ ) -> list[int]:
+     """Build Darts target lags for a forecasting horizon.
+
+     For a forecast target at horizon ``h`` and a seasonal period ``s``, the aligned seasonal reference point is:
+
+         (t + h) - s
+
+     expressed relative to prediction origin ``t``.
+
+     The corresponding aligned Darts lag ``l`` is therefore:
+
+         l = -(s - (h % s))
+
+     where the modulo wraps the horizon within the seasonal cycle.
+
+     Returned lags
+     -------------
+     The returned lag list always contains:
+
+     - ``l``:
+         the lag corresponding to the aligned seasonal position
+
+     In most cases, it additionally contains:
+
+     - ``l - 1``:
+         the observation immediately preceding the aligned seasonal position
+
+     Including both lags helps the model capture short-term local dynamics around the seasonal reference point,
+     rather than relying on a single aligned observation.
+
+     Example
+     -------
+     .. mermaid::
+
+         timeline
+             title Seasonal alignment example for h=3 and s=24
+             section Model lags
+                 t-25
+                 t-24 : seasonal anchor for s=24
+                 t-23
+                 t-22 : preceding point (second Darts lag)
+             section Δ24h seasonal offset
+                 t-21 : aligned seasonal point for t+3 (first Darts lag)
+                 ... t+l ...
+                 t-1
+                 t : prediction origin (belief time)
+                 t+1
+                 t+2
+             section Forecast horizons
+                 t+3 : forecast target at h=3 (event start)
+                 t+4
+                 ... t+H : max forecast horizon
+
+     For:
+
+         horizon = 3
+         seasonal_lag_steps = 24
+
+     we obtain:
+
+         l = -(24 - (3 % 24))
+           = -21
+
+     yielding:
+
+         [-21, -22]
+
+     corresponding to:
+
+         t-21 : aligned seasonal position for target t+3
+         t-22 : observation immediately preceding it
+
+     Edge case near maximum forecast horizon
+     ---------------------------------------
+     For the final forecast horizon (``horizon == max_forecast_horizon - 1``), only ``l`` is returned.
+     This avoids generating additional lag references that are not guaranteed to exist consistently during recursive multi-horizon prediction.
+     """
+     offset = horizon % seasonal_lag_steps
+     aligned_darts_lag = -(seasonal_lag_steps - offset)
+     # The preceding lag is omitted for the final forecast horizon,
+     # because it is not guaranteed to exist consistently during recursive inference.
+     if horizon != max_forecast_horizon - 1:
+         darts_lags = [aligned_darts_lag, aligned_darts_lag - 1]
+     else:
+         darts_lags = [aligned_darts_lag]
+     return darts_lags

This is what the included mermaid timeline looks like:

timeline
    title Seasonal alignment example for h=3 and s=24
    
    section  Model lags
        t-25
        t-24 : seasonal anchor for s=24
        t-23
        t-22 : preceding point (second Darts lag)
    section Δ24h seasonal offset
        t-21 : aligned seasonal point for t+3 (first Darts lag)
        ... t+l ...
        t-1
        t : prediction origin (belief time)
        t+1
        t+2
    section Forecast horizons
        t+3 : forecast target at h=3 (event start)
        t+4
        ... t+H : max forecast horizon
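
For anyone who wants to try the suggested logic outside the class, here is a standalone copy of the proposed function body with the docstring example worked through. The module-level function name and the `MAX_HORIZON` value of 48 are assumptions chosen only for this illustration.

```python
def lags_for_horizon(
    horizon: int, max_forecast_horizon: int, seasonal_lag_steps: int
) -> list[int]:
    """Standalone copy of the suggested ``_lags_for_horizon`` body, for experimentation."""
    offset = horizon % seasonal_lag_steps
    aligned_darts_lag = -(seasonal_lag_steps - offset)
    # The preceding lag is omitted only for the final forecast horizon.
    if horizon != max_forecast_horizon - 1:
        return [aligned_darts_lag, aligned_darts_lag - 1]
    return [aligned_darts_lag]


MAX_HORIZON = 48  # assumed value, chosen only for this illustration

print(lags_for_horizon(3, MAX_HORIZON, 24))   # [-21, -22], the docstring example
print(lags_for_horizon(47, MAX_HORIZON, 24))  # [-1], final horizon: aligned lag only

# Combining several seasonal periods into one set of lags, as in the suggested comprehension.
combined = sorted(
    {lag for s in [1, 24] for lag in lags_for_horizon(3, MAX_HORIZON, s)},
    reverse=True,
)
print(combined)  # [-1, -2, -21, -22]
```

Running the original and the suggested variants side by side over all horizons in a harness like this may be a quick way to answer the question about whether the tests and forecast quality are unaffected.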

Development

Successfully merging this pull request may close these issues.

Forecast degradation for longer horizons despite homogeneous input data

3 participants