Commit d25f464

Merge pull request #4 from superdoc-dev/caio/rename-ooxml-mcp
feat(mcp)!: rename to ooxml convention; refresh public surface
2 parents fa68046 + 7a40093 commit d25f464

19 files changed

Lines changed: 353 additions & 150 deletions

CLAUDE.md

Lines changed: 7 additions & 7 deletions
@@ -104,21 +104,21 @@ The XML you provide is wrapped in a minimal `w:document > w:body` structure auto
 
 ## MCP Server
 
-Cloudflare Worker exposing two flavors of MCP tools backed by the same database.
+Cloudflare Worker exposing two tool families over MCP, backed by the same database.
 
-Semantic search over the spec PDF (powered by `spec_content`):
+Prose search over the spec PDFs (powered by `spec_content`):
 
-- `search_ecma_spec` - semantic vector search across 18,000+ spec chunks
-- `get_section` - fetch a specific section by ID (e.g., "17.3.1.24")
-- `list_parts` - browse the spec structure
+- `ooxml_search` - semantic vector search across 18,000+ spec chunks
+- `ooxml_section` - fetch a specific section by ID (e.g., "17.3.1.24")
+- `ooxml_parts` - browse the spec structure
 
 Structural queries over the XSD schema graph (powered by `xsd_*` tables):
 
-- `ooxml_lookup_element` / `ooxml_lookup_type` - canonical symbol info
+- `ooxml_element` / `ooxml_type` - canonical symbol info
 - `ooxml_children` - legal children of an element/type/group, in document order
 - `ooxml_attributes` - attributes including those inherited and unfolded from attributeGroup refs
 - `ooxml_enum` - simpleType enumeration values
-- `ooxml_namespace_info` - vocabularies and per-profile symbol counts for a namespace URI
+- `ooxml_namespace` - vocabularies and per-profile symbol counts for a namespace URI
 
 Uses PostgreSQL with pgvector (Neon serverless in production, Docker locally).
 
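
Because the rename is a breaking change for any client that calls tools by name, a caller still pinned to the old names could translate them with a lookup table. The following is a minimal sketch and not part of this commit: the old and new names are copied from the diff, while `TOOL_RENAMES` and `migrateToolName` are hypothetical helpers introduced only for illustration.

```typescript
// Old tool name -> new tool name, as listed in this commit's diff.
// TOOL_RENAMES and migrateToolName are hypothetical; they do not exist in the repo.
const TOOL_RENAMES: ReadonlyArray<readonly [string, string]> = [
  ["search_ecma_spec", "ooxml_search"],
  ["get_section", "ooxml_section"],
  ["list_parts", "ooxml_parts"],
  ["ooxml_lookup_element", "ooxml_element"],
  ["ooxml_lookup_type", "ooxml_type"],
  ["ooxml_namespace_info", "ooxml_namespace"],
];

// Returns the post-rename name; names that were not renamed pass through unchanged.
function migrateToolName(name: string): string {
  for (const [oldName, newName] of TOOL_RENAMES) {
    if (oldName === name) {
      return newName;
    }
  }
  return name;
}
```

Tools that kept their names (`ooxml_children`, `ooxml_attributes`, `ooxml_enum`) are deliberately absent from the table, so they are returned as-is.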

README.md

Lines changed: 32 additions & 7 deletions
@@ -1,7 +1,7 @@
 <img width="300" alt="logo" src="https://github.com/user-attachments/assets/df6311a6-c050-4592-bbf1-4a2228655bc3" />
 
-[![Web](https://img.shields.io/badge/Web-v0.1.3-blue)](https://ooxml.dev)
-[![MCP Server](https://img.shields.io/badge/MCP_Server-v0.0.1-blue)](https://api.ooxml.dev/mcp)
+[![Web](https://img.shields.io/github/v/tag/superdoc-dev/ooxml-dev?filter=web-v*&label=Web&color=blue)](https://ooxml.dev)
+[![MCP Server](https://img.shields.io/github/v/tag/superdoc-dev/ooxml-dev?filter=mcp-v*&label=MCP%20Server&color=blue)](https://api.ooxml.dev/mcp)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
 The OOXML spec, explained by people who actually implemented it.
@@ -23,16 +23,41 @@ We faced this at SuperDoc — building a document engine on native OOXML with no
 
 ## MCP Server
 
-Ask questions in natural language and get answers grounded in the spec, or query the schema graph for precise structural answers.
+Ask questions in natural language and get answers grounded in the spec, or query the schema graph for precise structural answers. Works with Claude Code, Codex CLI, Cursor, and any MCP-compatible client.
+
+**Claude Code**
+
+```bash
+claude mcp add --transport http ooxml https://api.ooxml.dev/mcp
+```
+
+**Codex CLI**
 
 ```bash
-claude mcp add --transport http ecma-spec https://api.ooxml.dev/mcp
+codex mcp add ooxml --url https://api.ooxml.dev/mcp
+```
+
+Or in `~/.codex/config.toml`:
+
+```toml
+[mcp_servers.ooxml]
+url = "https://api.ooxml.dev/mcp"
+```
+
+**Cursor** — add to your MCP settings:
+
+```json
+{
+  "mcpServers": {
+    "ooxml": { "url": "https://api.ooxml.dev/mcp" }
+  }
+}
 ```
 
-Works with Claude Code, Cursor, and any MCP-compatible client. Two flavors of tools share one server:
+Two tool families share one server:
 
-- **Semantic** (over the spec PDF): `search_ecma_spec`, `get_section`, `list_parts`
-- **Structural** (over the parsed XSDs): `ooxml_lookup_element`, `ooxml_lookup_type`, `ooxml_children`, `ooxml_attributes`, `ooxml_enum`, `ooxml_namespace_info`
+- **Prose search** (over the spec PDFs): `ooxml_search`, `ooxml_section`, `ooxml_parts`
+- **Schema lookup** (over the parsed XSDs): `ooxml_element`, `ooxml_type`, `ooxml_children`, `ooxml_attributes`, `ooxml_enum`, `ooxml_namespace`
 
 ## Development
 

apps/mcp-server/README.md

Lines changed: 69 additions & 19 deletions
@@ -1,35 +1,85 @@
-# ECMA-376 Spec MCP Server
+# OOXML Reference MCP Server
 
-**The world's first ECMA-376 MCP server** - semantic search across the entire Office Open XML specification.
+Cloudflare Worker that exposes ECMA-376 (Office Open XML) over the Model Context Protocol. Two tool families share one server:
 
-- 18,000+ chunks from all 4 parts of ECMA-376
-- Vector search powered by Voyage embeddings + pgvector
-- Hosted on Cloudflare Workers
+- **Prose search** — semantic search across the four ECMA-376 part PDFs (~18,000 chunks, embedded with Voyage, queried with pgvector).
+- **Schema lookup** — deterministic queries over the parsed XSD graph (profiles, namespaces, symbols, content models, attributes, enums).
 
-## Connect in Claude Code
+Hosted at `https://api.ooxml.dev/mcp`.
+
+## Connect
+
+### Claude Code
+
+```bash
+claude mcp add --transport http ooxml https://api.ooxml.dev/mcp
+```
+
+### Codex CLI
 
 ```bash
-claude mcp add --transport http ecma-spec https://api.ooxml.dev/mcp
+codex mcp add ooxml --url https://api.ooxml.dev/mcp
+```
+
+Or add to `~/.codex/config.toml`:
+
+```toml
+[mcp_servers.ooxml]
+url = "https://api.ooxml.dev/mcp"
+```
+
+### Cursor
+
+Add to your Cursor MCP settings:
+
+```json
+{
+  "mcpServers": {
+    "ooxml": {
+      "url": "https://api.ooxml.dev/mcp"
+    }
+  }
+}
 ```
 
-## Endpoints
+### Other clients
+
+Any MCP-compatible client that speaks Streamable HTTP can connect to the endpoint directly.
+
+## Tools
 
-| Endpoint | Method | Description |
-|----------|--------|-------------|
-| `/mcp` | GET | MCP server info |
-| `/search` | POST | Semantic search (`{query, part?, limit?}`) |
-| `/section` | GET | Get section (`?id=17.3.2&part=1`) |
-| `/stats` | GET | Database stats |
+### Prose search
+
+| Tool | Returns |
+| --- | --- |
+| `ooxml_search` | Semantic search over the spec PDFs |
+| `ooxml_section` | Specific section by ID (e.g. `17.3.2`) |
+| `ooxml_parts` | Spec part / section structure |
+
+### Schema lookup
+
+| Tool | Returns |
+| --- | --- |
+| `ooxml_element` | Canonical info for an element by qname |
+| `ooxml_type` | Canonical info for a complexType or simpleType |
+| `ooxml_children` | Legal children of an element, type, or group (walks inheritance) |
+| `ooxml_attributes` | Attributes including inherited + attributeGroup refs |
+| `ooxml_enum` | Enumeration values for a simpleType |
+| `ooxml_namespace` | Vocabularies and per-profile symbol counts for a namespace URI |
+
+Default profile is `transitional`. Future profiles will compose Transitional with Office extension schemas.
 
 ## Development
 
 ```bash
-# Install
+# Install (from repo root)
 bun install
 
-# Run locally (needs .dev.vars with DATABASE_URL, VOYAGE_API_KEY)
-wrangler dev
+# Local dev — needs .dev.vars with DATABASE_URL and VOYAGE_API_KEY
+bun run dev:mcp
 
-# Deploy
-wrangler deploy
+# Deploy (from this directory)
+bun run deploy
 ```
+
+Database setup, ingest pipelines, and tests live at the repo root — see the top-level `README.md`.
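
For clients without built-in MCP support, the README's note about connecting to the endpoint directly comes down to sending JSON-RPC 2.0 messages. As a rough illustration, the sketch below builds a `tools/call` request body for the renamed `ooxml_section` tool. The `section_id` and `part` argument names come from the handler code in this commit; `ToolCallRequest` and `buildToolCall` are hypothetical names introduced here, and nothing in this sketch touches the network.

```typescript
// JSON-RPC 2.0 envelope for MCP's tools/call method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper (not part of the repo): wraps a tool name and its
// arguments in the JSON-RPC envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: request section 17.3.2 of Part 1 through the renamed tool.
const sectionRequest = buildToolCall(1, "ooxml_section", {
  section_id: "17.3.2",
  part: 1,
});

// Serialized body, ready to be sent as an HTTP POST payload.
const sectionRequestBody = JSON.stringify(sectionRequest);
```

Under the Streamable HTTP transport, a body like this would presumably be POSTed to `https://api.ooxml.dev/mcp` with `Content-Type: application/json` and an `Accept` header covering both `application/json` and `text/event-stream`. A full client would normally send an `initialize` request first; that step is omitted here.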

apps/mcp-server/src/index.ts

Lines changed: 13 additions & 49 deletions
@@ -1,17 +1,16 @@
 /**
- * ECMA-376 Spec MCP Server
+ * OOXML Reference MCP Server
  *
- * Cloudflare Worker that exposes ECMA-376 specification search via MCP protocol.
- *
- * Tools:
- * - search_ecma_spec: Semantic search across the spec
- * - get_section: Get specific section by ID
- * - list_parts: List spec parts and sections
+ * Cloudflare Worker exposing two tool families over MCP:
+ * - prose search over ECMA-376 PDFs (ooxml_search, ooxml_section, ooxml_parts)
+ * - schema lookup over the parsed XSD graph (ooxml_element, ooxml_type,
+ *   ooxml_children, ooxml_attributes, ooxml_enum, ooxml_namespace)
  */
 
 import { createDb } from "./db";
 import { embedQuery } from "./embeddings";
-import { handleMcpRequest } from "./mcp";
+import { handleMcpRequest, TOOLS } from "./mcp";
+import { OOXML_TOOL_DEFS } from "./ooxml-tools";
 
 export interface Env {
   DATABASE_URL: string;
@@ -169,7 +168,7 @@ export default {
       return addCorsHeaders(
         new Response(
           JSON.stringify({
-            name: "ECMA-376 Spec MCP Server",
+            name: "OOXML Reference MCP Server",
             version: "0.1.0",
             endpoints: {
               mcp: "/mcp",
@@ -188,50 +187,15 @@ export default {
   },
 };
 
-// MCP info endpoint (GET for debugging)
+// MCP info endpoint (GET for debugging). Tool list is derived from the same
+// canonical exports as the JSON-RPC tools/list response so they can't drift.
 function handleMcpInfo(): Response {
   return new Response(
     JSON.stringify({
-      name: "ecma-spec",
+      name: "ooxml",
       version: "0.1.0",
-      description: "ECMA-376 (Office Open XML) specification search server",
-      tools: [
-        {
-          name: "search_ecma_spec",
-          description: "Search the ECMA-376 specification semantically",
-          inputSchema: {
-            type: "object",
-            properties: {
-              query: { type: "string", description: "Natural language search query" },
-              part: { type: "number", description: "Filter by part number (1-4)" },
-              limit: { type: "number", description: "Max results (default: 5)" },
-            },
-            required: ["query"],
-          },
-        },
-        {
-          name: "get_section",
-          description: "Get a specific section by ID",
-          inputSchema: {
-            type: "object",
-            properties: {
-              section_id: { type: "string", description: "Section ID (e.g., '17.3.2')" },
-              part: { type: "number", description: "Part number (1-4)" },
-            },
-            required: ["section_id"],
-          },
-        },
-        {
-          name: "list_parts",
-          description: "List spec parts and sections",
-          inputSchema: {
-            type: "object",
-            properties: {
-              part: { type: "number", description: "Filter by part number (1-4)" },
-            },
-          },
-        },
-      ],
+      description: "OOXML (ECMA-376) reference server: prose search + schema lookup",
+      tools: [...TOOLS, ...OOXML_TOOL_DEFS],
     }),
     {
       headers: { "Content-Type": "application/json" },
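
The change in this file replaces a hand-maintained tool list with `[...TOOLS, ...OOXML_TOOL_DEFS]`, so the info endpoint and `tools/list` read from one source. Once two arrays are concatenated, a name collision between them becomes possible. The sketch below shows one way such a merge could be guarded. It is an assumption-laden illustration, not code from the repo: `NamedTool`, `PROSE_TOOLS`, `SCHEMA_TOOLS`, and `mergeToolDefs` are hypothetical stand-ins, and only the tool names are taken from the diff.

```typescript
// Minimal stand-in for the ToolDef shape; only the name matters for this check.
interface NamedTool {
  name: string;
}

// Stand-ins for the two exported arrays (TOOLS and OOXML_TOOL_DEFS in the diff).
const PROSE_TOOLS: NamedTool[] = [
  { name: "ooxml_search" },
  { name: "ooxml_section" },
  { name: "ooxml_parts" },
];
const SCHEMA_TOOLS: NamedTool[] = [
  { name: "ooxml_element" },
  { name: "ooxml_type" },
  { name: "ooxml_children" },
  { name: "ooxml_attributes" },
  { name: "ooxml_enum" },
  { name: "ooxml_namespace" },
];

// Hypothetical helper: concatenates the lists in order, like the spread in the
// diff, and throws if two tools share a name.
function mergeToolDefs(...lists: NamedTool[][]): NamedTool[] {
  const merged: NamedTool[] = [];
  const seen: Record<string, boolean> = {};
  for (const list of lists) {
    for (const tool of list) {
      if (seen[tool.name] === true) {
        throw new Error("duplicate tool name: " + tool.name);
      }
      seen[tool.name] = true;
      merged.push(tool);
    }
  }
  return merged;
}

const allTools = mergeToolDefs(PROSE_TOOLS, SCHEMA_TOOLS);
```

A check like this would arguably fit best in a unit test or at module load, since a duplicate name is a programming error rather than a runtime condition.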

apps/mcp-server/src/mcp.ts

Lines changed: 21 additions & 9 deletions
@@ -37,10 +37,22 @@ const PART_DESCRIPTIONS: Record<number, string> = {
   4: "Transitional Migration Features",
 };
 
+/** Shape of an MCP tool definition. Shared with OOXML_TOOL_DEFS so a future
+ * field added to one (annotations, outputSchema, etc.) widens both arrays. */
+export interface ToolDef {
+  name: string;
+  description: string;
+  inputSchema: {
+    type: "object";
+    properties: Record<string, unknown>;
+    required?: string[];
+  };
+}
+
 // Tool definitions
-const TOOLS = [
+export const TOOLS: ToolDef[] = [
   {
-    name: "search_ecma_spec",
+    name: "ooxml_search",
     description:
       "Semantic search across the ECMA-376 (Office Open XML) specification. Returns relevant sections based on natural language queries about WordprocessingML, SpreadsheetML, PresentationML, and more.",
     inputSchema: {
@@ -61,7 +73,7 @@ const TOOLS = [
     },
   },
   {
-    name: "get_section",
+    name: "ooxml_section",
     description:
       "Get a specific section of the ECMA-376 specification by section ID (e.g., '17.3.2' for paragraph properties).",
     inputSchema: {
@@ -77,7 +89,7 @@ const TOOLS = [
     },
   },
   {
-    name: "list_parts",
+    name: "ooxml_parts",
     description: "List ECMA-376 specification parts and their top-level sections.",
     inputSchema: {
       type: "object" as const,
@@ -124,11 +136,11 @@ function handleInitialize(id: number | string | null): JsonRpcResponse {
         tools: {},
       },
       serverInfo: {
-        name: "ecma-spec",
+        name: "ooxml",
         version: "0.1.0",
       },
       instructions:
-        "ECMA-376 (Office Open XML) specification search server. Use search_ecma_spec for semantic search, get_section for specific sections, or list_parts to browse the spec structure.",
+        "OOXML (ECMA-376 / Office Open XML) reference server. Two tool families: prose search over the spec PDFs (ooxml_search, ooxml_section, ooxml_parts) and deterministic schema lookup over the parsed XSDs (ooxml_element, ooxml_type, ooxml_children, ooxml_attributes, ooxml_enum, ooxml_namespace).",
     },
   };
 }
@@ -173,7 +185,7 @@ async function handleToolsCall(
   }
 
   switch (name) {
-    case "search_ecma_spec": {
+    case "ooxml_search": {
       const query = args?.query as string;
       const part = args?.part as number | undefined;
       const limit = Math.min((args?.limit as number) || 5, 20);
@@ -194,7 +206,7 @@ async function handleToolsCall(
       break;
     }
 
-    case "get_section": {
+    case "ooxml_section": {
       const sectionId = args?.section_id as string;
       const part = args?.part as number | undefined;
 
@@ -213,7 +225,7 @@ async function handleToolsCall(
       break;
     }
 
-    case "list_parts": {
+    case "ooxml_parts": {
       const part = args?.part as number | undefined;
 
       const db = createDb(env.DATABASE_URL);
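
The `ooxml_search` case keeps the existing expression `Math.min((args?.limit as number) || 5, 20)`, which defaults a missing or zero limit to 5 and caps it at 20 but sets no lower bound. The sketch below isolates that expression next to a stricter variant for comparison. `limitAsInDiff` mirrors the line in the diff; `clampLimit` is a hypothetical alternative that this commit does not contain.

```typescript
// Mirrors the expression used in the ooxml_search case: default 5, cap 20.
function limitAsInDiff(raw: unknown): number {
  return Math.min((raw as number) || 5, 20);
}

// Hypothetical stricter variant (not in this commit): falls back to the default
// for non-numeric, non-finite, fractional, or non-positive input.
function clampLimit(raw: unknown, fallback: number = 5, max: number = 20): number {
  if (
    typeof raw !== "number" ||
    !isFinite(raw) ||
    Math.floor(raw) !== raw ||
    raw < 1
  ) {
    return fallback;
  }
  return Math.min(raw, max);
}
```

The two agree on ordinary input (`undefined` and `0` give 5, `50` gives 20) and differ at the edges: `limitAsInDiff(-3)` evaluates to `-3`, while `clampLimit(-3)` falls back to `5`. Whether the query layer already guards against a negative limit is not visible in this diff.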

apps/mcp-server/src/ooxml-queries.ts

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 /**
  * Read-only schema-graph queries powering the OOXML MCP tools:
- * ooxml_lookup_element, ooxml_lookup_type, ooxml_children,
- * ooxml_attributes, ooxml_enum, ooxml_namespace_info.
+ * ooxml_element, ooxml_type, ooxml_children,
+ * ooxml_attributes, ooxml_enum, ooxml_namespace.
  *
  * These take a tagged-template SQL function (Neon in the deployed Worker,
  * postgres.js in local tests). All queries are profile-scoped and walk
