diff --git a/docs/tutorial-prompt-agent-quickstart.md b/docs/tutorial-prompt-agent-quickstart.md
index fa0f4c9..2ee3769 100644
--- a/docs/tutorial-prompt-agent-quickstart.md
+++ b/docs/tutorial-prompt-agent-quickstart.md
@@ -809,14 +809,28 @@ the smoke run above proves the workspace works. The next commands only harden
 the same gate with multi-turn rows that can later line up with trace replay and
 trace-to-dataset evidence.
 
-Create a small conversation-shaped dataset. It still keeps `input` and
-`expected` so AgentOps and azd can route the row, but it also carries the
-conversation turns that multi-turn evaluators and trace-derived rows use:
+Create a small set of **synthetic multi-turn test cases**. These rows do not
+claim that the agent actually produced the assistant turns verbatim. Each row
+defines a controlled conversation scenario that the next response must handle.
+
+Keep the important conversation context inside `input`, because that is the
+field AgentOps maps to the azd `query`. Also keep `messages` beside it, so that
+the dataset has the same shape as future trace-derived rows and release
+evidence can show that this gate covers conversation scenarios.
+
+> **What about full multi-turn evaluation?** Foundry also supports
+> **Full conversations** evaluation in preview from the portal: it evaluates a
+> complete multi-turn conversation from start to finish, including overall
+> conversation quality, task completion, and user satisfaction. This tutorial's
+> CLI / azd flow is intentionally simpler: it uses synthetic conversation-context
+> rows where the agent receives the relevant conversation summary in `input`, and
+> `messages` preserves the structured scenario for evidence and future
+> trace-derived regression.
 
 ```powershell
 @'
-{"input":"Plan a three-day Rome trip for a family with kids. Ask one clarification if needed.","expected":"The agent should preserve the family-with-kids constraint, propose a practical three-day Rome itinerary, include transit/rest pacing, and avoid claiming it can book live reservations.","messages":[{"role":"user","content":"We want to visit Rome with two kids."},{"role":"assistant","content":"How many days do you have and what pace do you prefer?"},{"role":"user","content":"Three days, moderate pace, museums and food."}]}
-{"input":"Help me choose between Lisbon and Seattle for a low-budget food weekend.","expected":"The agent should compare both destinations, mention budget tradeoffs, food activities, transit/weather notes, and avoid unsupported price or booking claims.","messages":[{"role":"user","content":"I need a low-budget food weekend."},{"role":"assistant","content":"Are you choosing between specific cities?"},{"role":"user","content":"Lisbon or Seattle."}]}
+{"input":"Conversation so far: the user wants to visit Rome with two kids. The assistant asked how many days and what pace they prefer. The user answered: three days, moderate pace, museums and food. Now plan the trip.","expected":"The agent should preserve the family-with-kids constraint, propose a practical three-day Rome itinerary, include transit/rest pacing, and avoid claiming it can book live reservations.","messages":[{"role":"user","content":"We want to visit Rome with two kids."},{"role":"assistant","content":"How many days do you have and what pace do you prefer?"},{"role":"user","content":"Three days, moderate pace, museums and food."}]}
+{"input":"Conversation so far: the user needs a low-budget food weekend. The assistant asked whether they are choosing between specific cities. The user answered: Lisbon or Seattle. Now compare those options.","expected":"The agent should compare both destinations, mention budget tradeoffs, food activities, transit/weather notes, and avoid unsupported price or booking claims.","messages":[{"role":"user","content":"I need a low-budget food weekend."},{"role":"assistant","content":"Are you choosing between specific cities?"},{"role":"user","content":"Lisbon or Seattle."}]}
 '@ | Set-Content -Encoding utf8 .agentops\data\travel-conversations.jsonl
 ```