You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Definition:
Retrieve last 180 days of sent mail via MS Graph delta query, apply PII masking (Presidio), embed using text-embedding-ada-002, and index into Azure AI Search (or ChromaDB fallback). Creates foundation for RAG precedent retrieval.
Note
Real GPT-4o text-embedding-ada-002, Graph API sent-email retrieval, and batch processing are fully wired up with deterministic fallback support.
Acceptance Criteria:
Graph delta query retrieves ≥50 sent emails
PII masking applied to body before embedding (names, emails, phone numbers, cards)
Embeddings generated using ada-002 model
Index created with schema: {email_id, sender, timestamp, subject, masked_body, embedding}
Index searchable by subject + masked_body (Searchable by cosine similarity matching)
Batch process: 50 emails at a time
Indexing completes in <5 minutes for typical dataset
Task: RAG-01 | PII-Mask + Embed + Index
Estimated Hours: 3h | Owner: MK | Priority: P0 | Status: Completed
Definition:
Retrieve last 180 days of sent mail via MS Graph delta query, apply PII masking (Presidio), embed using text-embedding-ada-002, and index into Azure AI Search (or ChromaDB fallback). Creates foundation for RAG precedent retrieval.
Note
Real GPT-4o text-embedding-ada-002, Graph API sent-email retrieval, and batch processing are fully wired up with deterministic fallback support.
Acceptance Criteria:
{email_id, sender, timestamp, subject, masked_body, embedding}