Skip to content

fix(backend): ground proactive insights and memory extraction in the current date#8501

Open
ZachL111 wants to merge 11 commits into
BasedHardware:mainfrom
ZachL111:zach/insight-date-grounding
Open

fix(backend): ground proactive insights and memory extraction in the current date#8501
ZachL111 wants to merge 11 commits into
BasedHardware:mainfrom
ZachL111:zach/insight-date-grounding

Conversation

@ZachL111

@ZachL111 ZachL111 commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Summary

Proactive insight notifications and extracted memories sometimes claim that a user's correctly dated content is in the future or that their system clock is wrong, for example "Your system clock is logging as 2026, not 2024. Check your OS date/time settings." or "The schedule is for 5/21/2026. This date is two years in the future." The user's clock is correct. The generators never told the model what today is, so it fell back to its training-cutoff year and treated every real future-year date as an anomaly. This injects the current date into those prompts.

Root cause

The chat and conversation-structuring prompts already ground the model in the present (utils/llm/chat.py and conversation_processing.py inject "Current date time in {tz}" and "today is {date}"). The proactive insight and memory generators did not:

  • utils/llm/proactive_notification.py: GATE_PROMPT, GENERATE_PROMPT, CRITIC_PROMPT, and the legacy PROACTIVE_PROMPT_TEMPLATE ask the model to reason about deadlines and whether something is "tomorrow" or upcoming, with no current date in the prompt. The proactive template even instructs the model to cite dates and judge "that's tomorrow".
  • utils/prompts.py: extract_memories_prompt and extract_memories_text_content_prompt extract facts from the user's content with no current date.

With nothing establishing "now", the model uses its own sense of the year (around its training cutoff), so a correctly recorded 2026 date reads as two years in the future.

Fix

  • Added utils/llm/temporal.py with current_date_in_tz(tz) and current_date_for_uid(uid): the current date as YYYY-MM-DD in the user's timezone, UTC fallback on a missing or failed timezone lookup.
  • Injected a "Today is {current_date}" line into all four proactive prompts, telling the model that dates in the current year or later are normal and not a clock error. The critic prompt goes further and is told to REJECT any notification that claims a correctly stated date is in the future or that the clock is wrong, so the bad notification class is blocked at the last gate.
  • Threaded the user's date through the proactive pipeline in utils/app_integrations.py (computed once per run from the user's timezone, passed to the gate, generate, and critic calls). The new parameter is optional and defaults to the current UTC date, so the prompt is always grounded even if a caller omits it.
  • Added the same grounding to the two memory-extraction prompts, and passed the current date through new_memories_extractor and extract_memories_from_text.

Additive only: every new parameter is optional, and the date lives in the dynamic part of the prompt, so the static prefix used for caching is unchanged.

Testing

  • New tests/unit/test_insight_date_grounding.py (12 tests): the date helper (timezone formatting, invalid or missing timezone fallback to UTC, per-uid lookup with failure fallback), the four proactive prompts each carrying the current date (including the critic's reject instruction and the UTC fallback when no date is passed), and both memory-extraction templates now requiring a current_date variable. Registered in test.sh.
  • Proven red against the pre-change version (7 of 12 fail without the change), green after.
  • tests/unit/test_mentor_notifications.py and tests/unit/test_proactive_notification_language.py still pass (47 tests), confirming no regression to the proactive pipeline.

Review in cubic

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 7 files

Confidence score: 3/5

  • In backend/utils/llm/memories.py, grounding relative time expressions against processing time instead of source-content time can misresolve phrases like “tomorrow” for delayed/retried/backfilled items, producing incorrect stored memories and downstream behavior—anchor extraction to the content/event timestamp (or pass an explicit reference datetime) before merging.
  • In backend/utils/llm/temporal.py, the module-level dependency on database.notifications introduces DB-layer side effects into a low-level temporal helper, which can trigger unwanted Firestore client initialization and make non-DB usage/tests brittle—decouple this import (e.g., move to call sites or inject the formatter dependency) before merging.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread backend/utils/llm/memories.py Outdated
Comment thread backend/utils/llm/temporal.py Outdated
@ZachL111

Copy link
Copy Markdown
Contributor Author

Both findings were valid and are now fixed.

P2, temporal.py coupling: moved the database.notifications import inside current_date_for_uid, so current_date_in_tz and the new date_in_tz import no DB layer and a module that uses only those does not initialize the Firestore client at import time. A timezone lookup failure inside current_date_for_uid still falls back to UTC.

P2, memory extraction date: new_memories_extractor and extract_memories_from_text now take an optional content_date and use it when provided, falling back to current_date_for_uid only when it is absent. process_conversation._extract_memories_inner passes the conversation's capture date (date_in_tz(conversation.started_at, user_tz)), so relative dates in delayed or backfilled content resolve against when the content happened rather than the processing time. Added a date_in_tz helper plus tests for the helper and the content-date fallback.

@kodjima33 kodjima33 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backend: ground proactive insights/memory extraction in current date — broad prompt change, approve only (>300 lines, leave merge for maintainer).

@Git-on-my-level

Copy link
Copy Markdown
Collaborator

Thanks @ZachL111 — this is a clean, well-scoped fix for a real bug. The root-cause analysis is spot on: the proactive insight and memory-extraction prompts asked the model to reason about dates and deadlines without ever anchoring "today," so it fell back to its training-cutoff year and produced hallucinated clock/calendar warnings. Grounding them in the current date is the right fix and matches the pattern already used in the chat pipeline.

A few things that stand out as solid:

  • temporal.py keeps the pure helpers (current_date_in_tz / date_in_tz) free of any DB import, with only current_date_for_uid touching Firestore lazily and with a UTC fallback — so importing the module never initializes the client. This addresses cubic's coupling note.
  • Threading content_date (the conversation's capture date) through the memory extractors rather than always using processing time correctly handles delayed/backfilled content — this addresses cubic's second note. Good catch responding to both.
  • New params are optional everywhere with sensible UTC fallbacks, so this is purely additive and won't break existing callers.
  • The critic prompt's REJECT instruction is carefully worded to block wrong clock-error notifications without suppressing legitimate future reminders.

The 12 new tests (helper behavior, per-prompt grounding, content_date fallback) and the red/green confirmation are appreciated.

Leaving the merge call to a human maintainer given the breadth of the prompt change across several user-facing generators, but this looks good to land.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants