feat(resolution): add C/C++ include path resolution#453
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds C/C++ awareness to the resolver so #include directives and common C/C++ symbols can be handled more accurately during indexing.
Changes:
- Adds
getCppIncludeDirs()toResolutionContextand wires it intoReferenceResolver. - Implements C/C++
#includeextraction and include-path resolution via discovered-Idirectories (compile DB / heuristic). - Adds C/C++ built-in symbol filtering plus unit tests for extraction and resolution.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/resolution/types.ts | Extends resolver context to expose C/C++ include search directories. |
| src/resolution/index.ts | Adds C/C++ built-in symbol filtering and exposes include-dir loading through context. |
| src/resolution/import-resolver.ts | Adds #include parsing, stdlib header filtering, include-dir discovery/caching, and include-dir-based resolution. |
| tests/resolution.test.ts | Adds test coverage for C/C++ include resolution and include-dir discovery. |
| tests/extraction.test.ts | Adds test coverage for C/C++ include extraction into unresolved references. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| c: ['.h', '.c'], | ||
| cpp: ['.h', '.hpp', '.hxx', '.cpp', '.cc', '.cxx'], |
|
|
||
| for (const dir of includeDirs) { | ||
| const normalizedDir = dir.replace(/\\/g, '/'); | ||
| for (const ext of extensions) { | ||
| const candidate = normalizedDir + '/' + importPath + ext; | ||
| if (context.fileExists(candidate)) return candidate; | ||
| } | ||
| // Try as-is (already has extension) | ||
| const candidate = normalizedDir + '/' + importPath; |
| // C/C++ standard library headers — both C-style (<stdio.h>) and | ||
| // C++-style (<cstdio>, <vector>) forms. Checked against the import | ||
| // path (which the extractor strips of <> or "" delimiters). | ||
| if (C_CPP_STDLIB_HEADERS.has(importPath)) return true; | ||
| // C++ headers without .h extension (e.g. "vector", "string") | ||
| const withoutExt = importPath.replace(/\.h$/, ''); | ||
| if (C_CPP_STDLIB_HEADERS.has(withoutExt)) return true; |
| * Handles double-quoted and single-quoted arguments. | ||
| */ | ||
| function shlexSplit(cmd: string): string[] { | ||
| const result: string[] = []; | ||
| let i = 0; | ||
| while (i < cmd.length) { | ||
| // Skip whitespace | ||
| while (i < cmd.length && /\s/.test(cmd[i]!)) i++; | ||
| if (i >= cmd.length) break; | ||
| const ch = cmd[i]!; | ||
| if (ch === '"') { | ||
| i++; | ||
| let arg = ''; | ||
| while (i < cmd.length && cmd[i] !== '"') { | ||
| if (cmd[i] === '\\' && i + 1 < cmd.length) { i++; arg += cmd[i]; } | ||
| else { arg += cmd[i]; } | ||
| i++; | ||
| } | ||
| i++; // closing quote | ||
| result.push(arg); | ||
| } else if (ch === "'") { | ||
| i++; | ||
| let arg = ''; | ||
| while (i < cmd.length && cmd[i] !== "'") { arg += cmd[i]; i++; } | ||
| i++; // closing quote | ||
| result.push(arg); | ||
| } else { | ||
| let arg = ''; | ||
| while (i < cmd.length && !/\s/.test(cmd[i]!)) { arg += cmd[i]; i++; } | ||
| result.push(arg); | ||
| } | ||
| } |
| 'size_t', 'ptrdiff_t', 'wchar_t', 'intptr_t', 'uintptr_t', | ||
| 'int8_t', 'int16_t', 'int32_t', 'int64_t', | ||
| 'uint8_t', 'uint16_t', 'uint32_t', 'uint64_t', | ||
| 'FILE', 'FILE', |
| export function clearImportMappingCache(): void { | ||
| importMappingCache.clear(); | ||
| cppIncludeDirCache.clear(); | ||
| } |
Add full import resolution pipeline for C and C++ #include directives, connecting extracted import nodes to actual header files in the project. - Add C/C++ extension resolution (.h, .hpp, .hxx, .cpp, .cc, .cxx) - Add system header filtering with ~80 C and ~80 C++ stdlib headers - Add extractCppImports() for #include import mapping extraction - Add compile_commands.json parsing for -I/-isystem include directories - Add heuristic include dir discovery (include/, src/, lib/, api/) - Add resolveCppIncludePath() for include directory search - Add C/C++ built-in symbol filtering (printf, malloc, std::*, etc.) - Wire getCppIncludeDirs into ResolutionContext - Add 13 new tests for C/C++ import resolution and extraction Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The PR landed the include-dir scan logic (loadCppIncludeDirs + resolveCppIncludePath) but the indexer never reached it: imports references with referenceName='X.h' fell into resolveViaImport's symbol-lookup branch (matched extractCppImports' basename-without-ext localName via .startsWith, then tried to find a symbol named like the extension and failed). End result on bitcoin-core: 0 new file→file imports vs main, despite the include-dir scan resolving paths correctly when probed directly. resolveViaImport now has a C/C++ imports branch that resolves the include path to the actual file node and returns that — skipping the irrelevant symbol scan. Measured on bitcoin-core: +2,059 newly resolved file→file imports (6,027 → 8,086, +34%). The unconditional CPP_BUILT_INS / C_BUILT_INS filter also misfired: C/C++ codebases routinely shadow stdlib names (bitcoin's mp::move, custom allocators with free/malloc, stream classes with read/write/ close/open, logging libs wrapping printf). Filtering those names killed legitimate edges — 1,179 → 0 for move(), 33 → 0 for free(), 149 → 7 for write() on bitcoin. The filter now defers to hasAnyPossibleMatch: only filter when no user-defined symbol with the name exists. std:: prefix stays unconditional (never user-shadowed in practice). After: printf/free/open/close/read/write/swap all preserved at main's counts; the std::move-binds-to-mp::move false-positives still drop (correctly: −2,154 C/C++ calls). Also: drop the duplicate 'FILE' in C_BUILT_INS; add an end-to-end test that asserts `#include "X.h"` produces a file→file imports edge in the real indexing pipeline (not just direct resolver probes); add a test documenting the cross-language `.h` heuristic claim (Obj-C dirs are intentionally allowed as C/C++ include dirs); add CHANGELOG entry under [Unreleased] with measured numbers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cbe460f to
5f6830f
Compare
|
Huge thanks for this @g122622 — C/C++ include resolution was a real gap and the bones of this PR were solid. The I pushed a follow-up review commit (5f6830f) before merging to fix two things I caught with an A/B on bitcoin-core (1,989 files) + tinyxml2:
Also added an end-to-end test that exercises |
Add full import resolution pipeline for C and C++ #include directives, connecting extracted import nodes to actual header files in the project.