
Commit 0f831e2

jackwildman authored and github-actions[bot] committed
feat: add date forecast type for timing questions (#5247)
## Summary

Adds a third forecast mode (`date`) alongside `binary` and `numeric`. Date forecasts produce YYYY-MM-DD percentile estimates (p10–p90) for "when will X happen?" questions, with prompts emphasizing delay bias and status-quo anchoring. Output columns follow the same `{output_field}_p{N}` naming as numeric, so `output_field` is required and `units` is ignored.

## Changes

- **Engine** (`forecast.py`, `task_spec.py`, `agent_state.py`, `operations.py`)
  - New `DATE_FORECASTER_PROMPT` with delay bias / never-happen sentinel (`2099-12-31`) guidance
  - New `build_date_response_schema()` returning string-typed percentile fields
  - New `_combine_batched_date_results()` aggregating with **median ordinals** (robust to the 2099 sentinel) instead of mean
  - `forecast_type` literal extended to `"binary" | "numeric" | "date"` across `DeepForecastPublicParams`, `DeepForecastFullParams`, `ForecastBatchStateSerializable`, and `ForecastOperation`
  - Date validation requires `output_field` and ignores `units`
- **OpenAPI / SDK / MCP**
  - Regenerated OpenAPI types
  - `forecast()` / `forecast_async()` SDK signatures and docstrings updated
  - MCP `ForecastInput` and `tools.py` mode_label updated
- **everyrow-cc frontend**
  - `ColumnInfo.forecastType` extended; date branch added in `extractColumnInfo`
  - New `extractDatePercentiles()` and `DatePercentileRangeBar` (timestamp-based scaling, compact "Jun '25" labels)
  - `ResearcherStreamItem`, `ResearcherDetailPanel`, `ResearcherStreamView` rendering branches added
- **everyrow-cc agent**
  - System prompt updated to mention `forecast_type="date"` for timing questions

## Design Notes

- **Median over mean** for date aggregation: dates aren't continuous in the same way numerics are, and median gracefully handles the 2099-12-31 "never happens" sentinel that some forecasters may emit.
- **Schema**: percentile fields use `{"type": "string"}`, mapping to `Nullable(String)` in ClickHouse via the existing `_JSON_TO_CH` mapping. No CH schema changes needed.
- **Backward compatible**: all changes are additive, extending `Literal` unions and adding `elif` branches. Binary/numeric forecasts are unaffected.

## Test plan

- [ ] `cd cohort/engine && uv run pyright src` (passes locally, 0 errors)
- [ ] `cd cohort/engine && uv run ruff check` (passes locally)
- [ ] `cd cohort/everyrow-cc/frontend && pnpm run tsc` (passes locally)
- [ ] `cd cohort/everyrow-cc/frontend && pnpm run lint` (passes locally)
- [ ] Run a date forecast end-to-end via the SDK and verify percentile output + visualization
- [ ] Verify a numeric/binary forecast still works (regression check)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Sourced from commit a46ce2370326ec99497b3e869021b3ce4d83068d
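The "median over mean" design note can be illustrated with a minimal sketch. This is not the engine's `_combine_batched_date_results()`; the helper names `median_date` and `mean_date` are hypothetical, and the sketch only assumes that each forecaster returns a YYYY-MM-DD string and that `2099-12-31` is the "never happens" sentinel described in the commit message.

```python
from datetime import date
from statistics import median


def median_date(iso_dates: list[str]) -> str:
    """Median of YYYY-MM-DD strings, computed on proleptic Gregorian ordinals."""
    ordinals = [date.fromisoformat(d).toordinal() for d in iso_dates]
    # For an even count, statistics.median averages the two middle ordinals,
    # so round to a whole day before converting back to a date.
    return date.fromordinal(round(median(ordinals))).isoformat()


def mean_date(iso_dates: list[str]) -> str:
    """Mean of the same ordinals, shown only for comparison."""
    ordinals = [date.fromisoformat(d).toordinal() for d in iso_dates]
    return date.fromordinal(round(sum(ordinals) / len(ordinals))).isoformat()


# Two forecasters give 2026 dates; a third emits the "never happens" sentinel.
p50_estimates = ["2026-03-01", "2026-06-15", "2099-12-31"]
print(median_date(p50_estimates))  # 2026-06-15
print(mean_date(p50_estimates))    # dragged decades later by the sentinel
```

With a three-forecaster ensemble, a single sentinel value leaves the median at one of the two real estimates, whereas the mean lands in a year none of the forecasters proposed.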
1 parent 5b47d70 commit 0f831e2

4 files changed

Lines changed: 31 additions & 14 deletions


futuresearch-mcp/src/futuresearch_mcp/models.py

Lines changed: 4 additions & 2 deletions
@@ -413,10 +413,12 @@ class ForecastInput(_SingleSourceInput):
         "(e.g. 'Focus on EU regulatory sources' or 'Assume resolution by end of 2027'). "
         "Leave empty when the rows are self-contained.",
     )
-    forecast_type: Literal["binary", "numeric"] = Field(
+    forecast_type: Literal["binary", "numeric", "date"] = Field(
         description="Type of forecast. 'binary': yes/no probability (0-100) for questions like "
         "'Will X happen?'. 'numeric': percentile estimates (p10-p90) for questions like "
-        "'What will the price/value/count be?'. Requires output_field when 'numeric'.",
+        "'What will the price/value/count be?'. 'date': date percentile estimates (p10-p90) "
+        "as YYYY-MM-DD strings for timing questions like 'When will X happen?'. "
+        "Requires output_field when 'numeric' or 'date'.",
     )
     output_field: str | None = Field(
         default=None,
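
The rule in the field description ("Requires output_field when 'numeric' or 'date'"), together with the commit message's note that `units` is ignored for dates, could be checked along these lines. This is a sketch only: `effective_forecast_params` is a hypothetical helper, not part of `futuresearch_mcp`, and the real validation may live elsewhere and behave differently.

```python
from typing import Literal, Optional

ForecastType = Literal["binary", "numeric", "date"]


def effective_forecast_params(
    forecast_type: ForecastType,
    output_field: Optional[str] = None,
    units: Optional[str] = None,
) -> dict:
    """Hypothetical helper applying the rules stated in the docstrings."""
    # Both percentile modes name their columns after output_field.
    if forecast_type in ("numeric", "date") and not output_field:
        raise ValueError(
            f"output_field is required when forecast_type is {forecast_type!r}"
        )
    # The SDK docstring states that units are required for numeric forecasts.
    if forecast_type == "numeric" and not units:
        raise ValueError("units is required when forecast_type is 'numeric'")
    return {
        "forecast_type": forecast_type,
        "output_field": output_field,
        # Units only apply to numeric forecasts; date mode ignores them.
        "units": units if forecast_type == "numeric" else None,
    }


print(effective_forecast_params("date", output_field="launch_date", units="USD"))
```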

futuresearch-mcp/src/futuresearch_mcp/tools.py

Lines changed: 12 additions & 4 deletions
@@ -647,7 +647,7 @@ async def futuresearch_forecast(
 ) -> list[TextContent]:
     """Forecast questions about the future using deep research and multi-model ensemble.
 
-    Supports two modes:
+    Supports three modes:
 
     - **binary** (default): Forecasts probability (0-100) for YES/NO questions.
       Output columns: ``probability`` (int, 0-100) and ``rationale`` (str).
@@ -657,6 +657,11 @@ async def futuresearch_forecast(
       Output columns: ``{output_field}_p10`` through ``{output_field}_p90`` (float),
       ``units`` (str), and ``rationale`` (str).
 
+    - **date**: Forecasts date percentile estimates for timing questions.
+      Requires ``output_field`` (e.g. ``"launch_date"``).
+      Output columns: ``{output_field}_p10`` through ``{output_field}_p90``
+      (YYYY-MM-DD strings) and ``rationale`` (str).
+
     The CSV should contain at minimum a ``question`` column. Recommended additional
     columns: ``resolution_criteria``, ``resolution_date``, ``background``. All
     columns are passed to the research agents and forecasters.
@@ -695,9 +700,12 @@ async def futuresearch_forecast(
     task_id = str(cohort_task.task_id)
     total = len(input_data) if isinstance(input_data, pd.DataFrame) else 0
 
-    mode_label = (
-        "numeric percentile" if params.forecast_type == "numeric" else "probability"
-    )
+    if params.forecast_type == "date":
+        mode_label = "date"
+    elif params.forecast_type == "numeric":
+        mode_label = "numeric percentile"
+    else:
+        mode_label = "probability"
     return await create_tool_response(
         task_id=task_id,
         label=f"Submitted: {total} rows for {mode_label} forecasting (6 research dimensions + 3 forecasters per batch)."
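
The date-mode output columns described in the docstring above (`{output_field}_p10` through `{output_field}_p90` as strings, plus `rationale`) suggest a response schema roughly like the sketch below. This is not the engine's `build_date_response_schema()`: the function name `sketch_date_response_schema` is hypothetical, and the exact percentile set (10, 25, 50, 75, 90) is an assumption, since the source only gives the p10 to p90 range.

```python
# Assumed percentile set; the commit only states "p10 through p90".
ASSUMED_PERCENTILES = (10, 25, 50, 75, 90)


def sketch_date_response_schema(
    output_field: str,
    percentiles: tuple[int, ...] = ASSUMED_PERCENTILES,
) -> dict:
    """Build a JSON-schema-style dict with string-typed percentile fields."""
    properties = {
        # Dates travel as YYYY-MM-DD strings, hence {"type": "string"}.
        f"{output_field}_p{p}": {"type": "string"}
        for p in percentiles
    }
    properties["rationale"] = {"type": "string"}
    return {
        "type": "object",
        "properties": properties,
        "required": list(properties),
    }


schema = sketch_date_response_schema("launch_date")
print(sorted(schema["properties"]))
```

Keeping the fields string-typed matches the design note that they map to `Nullable(String)` in ClickHouse without a schema change.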

src/futuresearch/generated/models/forecast_operation_forecast_type.py

Lines changed: 1 addition & 0 deletions
@@ -4,6 +4,7 @@
 class ForecastOperationForecastType(str, Enum):
     BINARY = "binary"
     NUMERIC = "numeric"
+    DATE = "date"
 
     def __str__(self) -> str:
         return str(self.value)

src/futuresearch/ops.py

Lines changed: 14 additions & 8 deletions
@@ -818,13 +818,13 @@ async def forecast(
     context: str | None = None,
     session: Session | None = None,
     *,
-    forecast_type: Literal["binary", "numeric"],
+    forecast_type: Literal["binary", "numeric", "date"],
     output_field: str | None = None,
     units: str | None = None,
 ) -> TableResult:
     """Forecast questions using deep research and multi-model ensemble.
 
-    Supports two modes:
+    Supports three modes:
 
     - **binary** (default): Forecasts the probability (0-100) of YES/NO questions.
       Output columns: ``probability`` (int) and ``rationale`` (str).
@@ -834,6 +834,11 @@ async def forecast(
       Output columns: ``{output_field}_p10`` through ``{output_field}_p90`` (float),
       ``units`` (str), and ``rationale`` (str).
 
+    - **date**: Forecasts percentile date estimates for timing questions.
+      Requires ``output_field`` (e.g. ``"launch_date"``).
+      Output columns: ``{output_field}_p10`` through ``{output_field}_p90``
+      (YYYY-MM-DD strings) and ``rationale`` (str).
+
     Each row is forecast using 6 parallel research agents followed by a 3-model
     forecaster ensemble, validated against FutureSearch's past-casting environment.
 
@@ -848,9 +853,9 @@ async def forecast(
             end of 2027"). Leave *None* when the rows are self-contained.
         session: Optional session. If not provided, one will be created automatically.
         forecast_type: ``"binary"`` for probability forecasts, ``"numeric"`` for
-            percentile estimates.
-        output_field: Name of the quantity being forecast (required for numeric,
-            e.g. ``"price"``, ``"count"``).
+            percentile estimates, ``"date"`` for date percentile estimates.
+        output_field: Name of the quantity being forecast (required for numeric
+            and date, e.g. ``"price"``, ``"launch_date"``).
         units: Units for numeric forecasts (e.g. ``"USD per barrel"``).
             Required when *forecast_type* is ``"numeric"``.
 
@@ -890,7 +895,7 @@ async def forecast_async(
     task: str,
     session: Session,
     input: DataFrame | UUID | TableResult,
-    forecast_type: Literal["binary", "numeric"],
+    forecast_type: Literal["binary", "numeric", "date"],
     output_field: str | None = None,
     units: str | None = None,
 ) -> EveryrowTask[BaseModel]:
@@ -900,8 +905,9 @@ async def forecast_async(
         task: Context or instructions for the forecast.
         session: Active session.
         input: Input data.
-        forecast_type: ``"binary"`` for yes/no probability, ``"numeric"`` for percentile estimates.
-        output_field: Name of the numeric quantity (required for numeric).
+        forecast_type: ``"binary"`` for yes/no probability, ``"numeric"`` for
+            percentile estimates, ``"date"`` for date percentile estimates.
+        output_field: Name of the quantity (required for numeric and date).
         units: Units for numeric forecasts (required for numeric).
 
     Returns:
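
Because date percentiles come back as YYYY-MM-DD strings rather than date objects, a consumer of the result table will typically parse them before doing arithmetic. The sketch below works on a plain dict standing in for one result row; `summarize_date_row` is a hypothetical helper, the sample row is made up for illustration, and treating `2099-12-31` as "never" follows the sentinel mentioned in the commit message.

```python
from datetime import date

NEVER_SENTINEL = "2099-12-31"  # "never happens" value named in the commit message


def summarize_date_row(row: dict, output_field: str) -> dict:
    """Parse the p10/p90 columns of one date-forecast row (hypothetical helper)."""
    p10_raw = row[f"{output_field}_p10"]
    p90_raw = row[f"{output_field}_p90"]
    p10 = date.fromisoformat(p10_raw)
    p90 = date.fromisoformat(p90_raw)
    # A sentinel in the upper tail means the forecasters put meaningful
    # weight on the event never happening, so the spread is not informative.
    tail_is_never = p90_raw == NEVER_SENTINEL
    return {
        "p10": p10,
        "p90": p90,
        "spread_days": None if tail_is_never else (p90 - p10).days,
        "tail_is_never": tail_is_never,
    }


# Illustrative row only; real rows come from the SDK's result table.
example_row = {
    "question": "When will the product launch?",
    "launch_date_p10": "2026-01-15",
    "launch_date_p90": "2026-12-31",
    "rationale": "Illustrative text.",
}
print(summarize_date_row(example_row, "launch_date"))
```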
