
Add scip merge command to combine multiple SCIP indexes #420

Draft
jupblb wants to merge 3 commits into main from michal/merge

Collaborator

@jupblb jupblb commented May 19, 2026

The merge command takes two or more SCIP indexes and produces a single combined index. The output project_root is automatically inferred as the common URI ancestor of the inputs' project_root values. Use --project-root to override the inferred root.

jupblb added 3 commits May 19, 2026 16:05
The merge command takes two or more SCIP indexes and produces a single
combined index. The output project_root is automatically inferred as the
common URI ancestor of the inputs' project_root values, and each input
document's relative_path is rewritten to be relative to that root.

Use --project-root to override the inferred root (e.g. to root the merged
index at a parent directory).

Example:
  scip merge --output merged.scip a.scip b.scip c.scip

Documents with the same rewritten relative_path and external symbols with
the same name are deduplicated via the existing FlattenDocuments and
FlattenSymbols helpers.
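
The deduplication step can be pictured with a minimal sketch. It does not call the real FlattenDocuments helper; the `document` type and `mergeByPath` function are hypothetical stand-ins, and the assumption that duplicates are combined by concatenating their occurrences is mine, not something this PR states.

```go
package main

import "fmt"

// document is a hypothetical, simplified stand-in for a SCIP document.
type document struct {
	relativePath string
	occurrences  []string
}

// mergeByPath combines documents that share a relativePath, keeping the
// order in which each path was first seen. Assumption: duplicates are
// merged by appending their occurrences to the first copy.
func mergeByPath(docs []document) []document {
	index := make(map[string]int) // relativePath -> position in out
	var out []document
	for _, d := range docs {
		if i, ok := index[d.relativePath]; ok {
			out[i].occurrences = append(out[i].occurrences, d.occurrences...)
			continue
		}
		index[d.relativePath] = len(out)
		out = append(out, document{
			relativePath: d.relativePath,
			// Copy so the result does not alias the caller's slice.
			occurrences: append([]string(nil), d.occurrences...),
		})
	}
	return out
}

func main() {
	merged := mergeByPath([]document{
		{relativePath: "a/main.go", occurrences: []string{"o1"}},
		{relativePath: "b/lib.go", occurrences: []string{"o2"}},
		{relativePath: "a/main.go", occurrences: []string{"o3"}},
	})
	for _, d := range merged {
		fmt.Println(d.relativePath, d.occurrences)
	}
}
```

Keying on the rewritten path rather than the original one matters here: two inputs can only collide after their relative_path values have been re-rooted.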

- Fold metadata/version/encoding validation and root-URI parsing into a
  single pass over the input indexes.
- Replace four URI-path helpers (commonAncestorURI, normalizeURIPath,
  commonPathPrefix, relativePrefix) with three plain slash-path helpers
  (commonPath, isAncestor, relativeTo) operating on cleaned paths.
- Normalize project_root paths once at parse time via parseRootURI, so
  the path helpers do not need to special-case trailing slashes, dot
  segments, or empty paths.
- Drop encodingsCompatible and computeOutputRootAndPrefixes wrappers;
  inline their bodies.
- Use strings.CutPrefix in relativeTo.
- Drop the empty-RelativePath special case in document rewriting; rely
  on path.Join's behavior.

Output is unchanged: end-to-end merge of three real Go-project indexes
produces the same 62 documents / 15,340 occurrences / project_root.

Adds tests for protocol-version mismatch, missing metadata, and empty
RelativePath with a non-empty prefix.

Previously mergeIndexes treated TextEncoding_UnspecifiedTextEncoding as
a wildcard: an Unspecified input would be silently 'promoted' to the
encoding of another input, and the merged metadata would be labeled
with that concrete encoding. This silently mislabels source files whose
encoding was never actually known.

Now all inputs must agree exactly on TextDocumentEncoding (Unspecified
counts as itself). Same for ProtocolVersion.
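
A minimal sketch of the strict-agreement rule, assuming simplified local types: `textEncoding`, `metadata`, and `checkAgreement` are hypothetical names used for illustration, not the generated SCIP types or the actual mergeIndexes code.

```go
package main

import (
	"errors"
	"fmt"
)

// textEncoding is a hypothetical stand-in for the SCIP text encoding enum.
type textEncoding int

const (
	unspecified textEncoding = iota
	utf8
	utf16
)

// metadata holds only the two fields that must match across inputs.
type metadata struct {
	protocolVersion int
	encoding        textEncoding
}

// checkAgreement returns the shared metadata if every input has exactly
// the same protocol version and encoding. Unspecified is compared like
// any other value; it is never promoted to a concrete encoding.
func checkAgreement(inputs []metadata) (metadata, error) {
	if len(inputs) == 0 {
		return metadata{}, errors.New("no input indexes")
	}
	first := inputs[0]
	for i, m := range inputs[1:] {
		if m.protocolVersion != first.protocolVersion {
			return metadata{}, fmt.Errorf("input %d: protocol version %d differs from %d",
				i+1, m.protocolVersion, first.protocolVersion)
		}
		if m.encoding != first.encoding {
			return metadata{}, fmt.Errorf("input %d: encoding %d differs from %d",
				i+1, m.encoding, first.encoding)
		}
	}
	return first, nil
}

func main() {
	// All Unspecified: accepted, and the output stays Unspecified.
	fmt.Println(checkAgreement([]metadata{{0, unspecified}, {0, unspecified}}))
	// Unspecified mixed with a concrete encoding: rejected.
	fmt.Println(checkAgreement([]metadata{{0, unspecified}, {0, utf8}}))
}
```

Plain equality keeps the check symmetric: the result does not depend on the order in which the indexes are passed.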

Tests updated to cover both the new error case (Unspecified mixed with
a concrete encoding) and the preservation case (all Unspecified).