feat(mcp): ChatGPT/Claude connector search and fetch over the memory bank (#4862)#8492
feat(mcp): ChatGPT/Claude connector search and fetch over the memory bank (#4862)#8492ZachL111 wants to merge 9 commits into
Conversation
There was a problem hiding this comment.
5 issues found across 4 files
Confidence score: 2/5
- In
backend/routers/mcp.py, the search endpoint puts sensitive query text in GET URL parameters, which can bypass existing PII log sanitization and leak user data via access logs/history; this is the highest-risk issue in the PR — move the payload to a sanitized POST body (or equivalent) before merging. - In
_search_conversationsinbackend/utils/mcp_search.py, using onlystructured.overviewfor resulttextcan return empty previews even when transcripts exist, so users may see blank/low-quality search results whilefetchreturns content — add transcript fallback logic for preview text before merging. - Also in
_search_conversations(backend/utils/mcp_search.py), not over-fetching candidates when top-k items are locked can under-fill returned results and make search appear incomplete — over-fetch and then filter (as done for memories) to keep result counts stable. - In ID parsing paths in
backend/utils/mcp_search.py, prefixed-but-empty IDs likememory:are treated as missing items (404) instead of invalid requests (400), andbackend/tests/unit/test_mcp_data_endpoints.pydoes not exercise rejected-item filtering as named, increasing behavior drift risk — add explicit empty-ID validation and extend the test to cover rejected filtering before merging.
Reply with feedback, questions, or to request a fix.
Re-trigger cubic
|
Thanks cubic, all five were valid and are fixed.
|
|
Thanks for this, @ZachL111 — really clean work and squarely on-mission for #4862 (making Omi's memory bank directly consumable by ChatGPT/Claude via the standard MCP I did a static pass over the full diff against Verified good
One consistency gap worth hardening (low severity) Status |
|
Thanks David, good catch on the search/fetch disagreement. |
|
Thanks @ZachL111 — confirmed the Static pass against
This is squarely on-mission for #4862 and well-tested. I'm not approving it myself, though: it adds a new full-text |
|
Thanks David, that matches the intent. The new fetch path is read only and every DB call is uid scoped, so it adds reach over the caller's own memories and conversations but no new cross user surface, which is why I kept the connector stricter than REST (drops locked conversations in search, paywalls them in fetch). I agree the data access direction deserves a maintainer plus security sign off rather than just a code pass, so the security-review and needs-maintainer-review labels make sense to me. If a maintainer would rather split memories.read and conversations.read into two separate connector scopes, or gate the combined scope behind an explicit opt in, I am happy to make that change in this PR. Leaving it as is for now per your note that the consistency fix can stay. |
kodjima33
left a comment
There was a problem hiding this comment.
MCP feature: connector search/fetch over memory base — approve only.
Summary
Advances #4862 (Omi needs to become an easily integratable memory bank for chatgpt/claude), in the "Great, sharable product" milestone.
Claude already speaks MCP, so the Omi MCP server's existing tools work there directly. ChatGPT is the gap: OpenAI's ChatGPT connectors and deep-research models consume an MCP server through a standard pair of read-only tools,
searchandfetch, with a fixed result shape. Omi only exposed domain-specific search tools (search_memories,search_conversations), so it was not directly usable as a ChatGPT connector. This adds the standard contract over the user's memory bank.What this adds
Two new MCP tools following OpenAI's connector contract:
search(query): unified search across the user's memories and conversations (the two richest text sources), returning{results: [{id, title, url, text}]}. Result ids are namespaced (memory:<id>/conversation:<id>).fetch(id): returns the full document for a result id as{id, title, text, url, metadata}.Both are read-only. The
tools/callresponse echoessearchandfetchresults asstructuredContentalongside the JSON text, which the connector contract expects. Matching REST endpoints (GET /v1/mcp/search,GET /v1/mcp/fetch) are added for parity.Design
utils/mcp_search.pyis the single orchestration both transports call. It reuses the existing vector search (find_similar_memories,query_vectors), applies the same safety filterssearch_memoriesalready uses (never surfaces rejected, locked, or superseded facts, and skips locked conversations), interleaves the two sources so both are represented, and namespaces ids sofetchknows which store to read.fetchmaps a missing item to 404 / -32001 and a locked item to 402 / -32002, and validates the id and query types so a non-string argument returns a clean error instead of an uncaught one.h.omi.meweb host (for examplehttps://h.omi.me/conversations/<id>).Combined with the read and write tools already on the server (memories, conversations, action items, goals, people, folders, screen activity, daily summaries), this makes the Omi MCP server a drop-in memory bank for both Claude (native MCP) and ChatGPT (search/fetch connector).
Testing
Added 14 tests to
tests/unit/test_mcp_data_endpoints.py(already wired intotest.sh): search merges and ranks both sources, the safety filters drop locked and rejected items, fetch returns the right shape for memories and conversations, not-found is 404 / -32001, locked is 402 / -32002, bad and non-string ids are rejected, the tools are registered, andtools/callemitsstructuredContentfor search. The full file passes (31 tests), and the adjacent MCP tests still pass. Black and the async-blocker lint are clean.