fix: use commit query for npm packages resolved from git URLs #2863
johnsaurabh wants to merge 3 commits into
Packages from yarn.lock that are pinned to a git URL (e.g. resolved via git+https://) carry ecosystem=npm because that reflects the file they were found in, but they were never published to the npm registry. Querying OSV.dev by npm ecosystem+name produces false positive matches against unrelated packages that happen to share the same name on npm.
The yarnlock extractor already populates SourceCode.Commit for these dependencies via commitextractor.TryExtractCommit. When a commit hash is present for an npm-ecosystem package, use a commit query instead so the lookup is scoped to the actual upstream git commit.
Fixes google#2850
Verify that regular npm packages continue to use ecosystem queries, and that npm packages with a populated commit hash (git-pinned via yarn.lock) use a commit query instead.
another-rex left a comment
Thanks, LGTM! Can you add an e2e test for this as well? In cmd/osv-scanner/scan/source/command_test.go
Hi @another-rex, thanks for the update. Let me know if anything else is needed. Thanks!
0xTaoZ left a comment
Thanks for working on this. The direction makes sense to me, especially keeping the ecosystem as npm and changing only the OSV query shape when a commit is available.
One small edge I think is worth guarding with a unit test: pkgToQuery currently switches any npm package with SourceCode.Commit to a commit query. That matches the yarn.lock git-pinned case, but a future extractor could plausibly populate SourceCode.Commit for a normal npm registry package/version as source metadata. If that happened, the npm ecosystem+version query would be skipped and npm advisories that are not tied to git commits could be missed.
Could you add a regression case that documents the intended boundary here? Either a normal npm registry package with a commit should still use the ecosystem query, or, if commit should always win for npm, a short comment/test name explaining why that is safe would make the behavior much easier to maintain.
Overview
Fixes #2850
pkgToQuery queried OSV.dev by npm ecosystem+name for all npm-ecosystem packages, including those resolved from git URLs in yarn.lock. These packages were never published to the npm registry, so the query returns false positives from unrelated packages that share the same name on npm.
Details
The yarnlock extractor already populates SourceCode.Commit for git-pinned dependencies via commitextractor.TryExtractCommit. When a commit hash is present for an npm-ecosystem package, pkgToQuery now returns a commit query instead of an ecosystem query.
```go
if imodels.Commit(pkg) != "" && imodels.Ecosystem(pkg).Ecosystem == osvconstants.EcosystemNPM {
	return &api.Query{
		Param: &api.Query_Commit{
			Commit: imodels.Commit(pkg),
		},
	}
}
```
The gate is specific to npm. Other ecosystems are unaffected.
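To make the query-shape decision easier to reason about in isolation, here is a minimal, self-contained sketch of the gate. It does not use the real osv-scanner types: `pkg`, `query`, `toQuery`, and `describe` are hypothetical stand-ins, the package names and commit hash are placeholders, and the fallback order below the npm gate is an assumption of this sketch rather than something stated in this PR.

```go
package main

import "fmt"

// pkg is a hypothetical stand-in for the scanner's package model; it holds
// only the fields needed to choose a query shape.
type pkg struct {
	Name      string
	Version   string
	Ecosystem string
	Commit    string // non-empty when an extractor recorded an upstream git commit
}

// query is a simplified stand-in for an OSV query: either Commit is set,
// or the Ecosystem/Name/Version triple is.
type query struct {
	Commit    string
	Ecosystem string
	Name      string
	Version   string
}

// toQuery mirrors the gate described above: an npm package that carries a
// commit is looked up by commit. The fallback order after the gate is an
// assumption made for this sketch.
func toQuery(p pkg) *query {
	if p.Commit != "" && p.Ecosystem == "npm" {
		return &query{Commit: p.Commit}
	}
	if p.Name != "" && p.Version != "" && p.Ecosystem != "" {
		return &query{Ecosystem: p.Ecosystem, Name: p.Name, Version: p.Version}
	}
	if p.Commit != "" {
		return &query{Commit: p.Commit}
	}
	return nil
}

// describe reports which query shape was chosen.
func describe(q *query) string {
	switch {
	case q == nil:
		return "none"
	case q.Commit != "":
		return "commit"
	default:
		return "ecosystem"
	}
}

func main() {
	const placeholderCommit = "0123456789abcdef0123456789abcdef01234567"
	cases := []pkg{
		// Regular registry package: no commit recorded.
		{Name: "registry-pkg", Version: "1.0.0", Ecosystem: "npm"},
		// Git-pinned yarn.lock entry: ecosystem is npm, commit is populated.
		{Name: "git-pinned-pkg", Version: "1.0.0", Ecosystem: "npm", Commit: placeholderCommit},
		// Non-npm package with a commit: the npm gate does not apply.
		{Name: "example.com/mod", Version: "1.2.3", Ecosystem: "Go", Commit: placeholderCommit},
	}
	for _, c := range cases {
		fmt.Printf("%s (%s): %s query\n", c.Name, c.Ecosystem, describe(toQuery(c)))
	}
}
```

In this sketch the gate keys only on ecosystem and commit, so an npm registry package that happened to carry a commit would also be routed to a commit query, which is the boundary raised in review.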
Testing
Checklist