Parser proposal: Xygeni JSON reports
Scanner Name
Xygeni — Software Supply Chain Security platform with multiple scanners
(SAST, SCA, secrets, IaC, CI/CD, DAST, suspect-deps, code-tampering).
Site: https://xygeni.io · Docs: https://docs.xygeni.io
Sample File(s)
Format: JSON, one report per scanner, all sharing a common metadata envelope.
Trimmed inline samples are included under each per-kind section below
(SAST, SCA, Secrets). Full anonymized sample reports for the three phase-1
scan types will be attached as comments on this issue, and checked into
unittests/scans/xygeni/ as part of the phase-1 PR.
About Xygeni
Xygeni is a platform for improving an organization's software supply chain
security posture. It provides a set of scanners, each specialized in one
software security domain: code vulnerabilities (SAST), vulnerabilities in
open-source components (SCA), hard-coded secrets, flaws in IaC templates,
vulnerabilities in web applications (DAST), misconfigurations in version
control and CI/CD systems, and malicious behavior in owned and third-party
components.
Proposal
Each Xygeni scanner emits a JSON report with a shared metadata envelope and a
kind-specific payload. We'd like to add a single first-party parser at
dojo/tools/xygeni/ that dispatches on metadata.scanType and routes to a
per-kind handler. This mirrors established precedent in dojo/tools/
(rusty_hog, anchore_grype, checkmarx, sonarqube, mobsf) and avoids
adding one near-duplicate xygeni_* parser per scanner.
The intent of this issue is to obtain pre-approval on:
- The shape — one parser with multiple scan types, dispatched on
metadata.scanType.
- The phase-1 mappings — the field-by-field translations from the SAST,
SCA, and Secrets JSON to DefectDojo Finding objects, documented below.
- The phasing — approving the structure now and growing it through
focused follow-up PRs, each adding one or more additional scan types when
that grouping is natural.
The parser will follow the "Contribute to Parsers" recommendations from the
DefectDojo documentation.
In scope (this proposal and the phase-1 PR that follows)
- A single parser package at dojo/tools/xygeni/ exposing three scan types:
Xygeni SAST Scan, Xygeni SCA Scan, Xygeni Secrets Scan.
- A per-kind handler module for each (sast.py, sca.py, secrets.py), invoked
from a thin XygeniParser.get_findings().
- Severity, dedup, and CWE/CVE conversion utilities shared across the three
kinds (_common.py).
- Unit tests under unittests/tools/test_xygeni_parser.py.
- Real Xygeni sample reports checked in as test fixtures under
unittests/scans/xygeni/{sast,sca,secrets}_*.json (empty report,
multi-finding report covering all severities, plus targeted edge-case
fixtures per kind).
- A documentation page at
docs/content/en/connecting_your_tools/parsers/file/xygeni.md covering all
three scan types and pointing to the Xygeni CLI commands that produce the
matching JSON.
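As a rough illustration of the "thin XygeniParser.get_findings()" idea, the dispatch could look something like the sketch below. The four method names follow the interface DefectDojo parsers expose; everything else (the HANDLERS table, the placeholder handler functions, and the choice to raise ValueError on an unknown scanType) is an assumption made for this example, not the final implementation. The placeholder handlers return tuples instead of Finding objects so the sketch runs without DefectDojo installed.

```python
import io
import json

# Hypothetical stand-ins for the handlers that would live in sast.py, sca.py
# and secrets.py. Real handlers would return DefectDojo Finding objects.
def parse_sast(report, test):
    return [("sast", v.get("detector")) for v in report.get("vulnerabilities", [])]

def parse_sca(report, test):
    return [
        ("deps", v.get("id"))
        for dep in report.get("dependencies", [])
        for v in dep.get("vulnerabilities", [])
    ]

def parse_secrets(report, test):
    return [("secrets", s.get("detector")) for s in report.get("secrets", [])]

# metadata.scanType value -> (DefectDojo scan type label, handler)
HANDLERS = {
    "sast": ("Xygeni SAST Scan", parse_sast),
    "deps": ("Xygeni SCA Scan", parse_sca),
    "secrets": ("Xygeni Secrets Scan", parse_secrets),
}

class XygeniParser:
    def get_scan_types(self):
        return [label for label, _ in HANDLERS.values()]

    def get_label_for_scan_types(self, scan_type):
        return scan_type

    def get_description_for_scan_types(self, scan_type):
        return "Import a Xygeni JSON report."

    def get_findings(self, file, test):
        report = json.load(file)
        kind = (report.get("metadata") or {}).get("scanType")
        if kind not in HANDLERS:
            # Assumption: fail loudly rather than silently import nothing.
            raise ValueError(f"Unsupported Xygeni scanType: {kind!r}")
        _, handler = HANDLERS[kind]
        return handler(report, test)

# Usage with a minimal in-memory report.
sample = io.StringIO(
    '{"metadata": {"scanType": "secrets"}, "secrets": [{"detector": "aws-access-key"}]}'
)
findings = XygeniParser().get_findings(sample, test=None)
```

Adding a phase-2 scan type would then amount to one new handler module and one new entry in the table.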
Out of scope (future follow-up PRs, listed for context only)
These Xygeni scan kinds are not part of this proposal. We mention them so
the maintainers can see the full direction and approve the parser structure
once. They will be delivered through follow-up PRs that extend
XygeniParser.get_scan_types() and add the corresponding handlers, fixtures,
and docs sections:
- Xygeni IaC Scan: Terraform / CloudFormation / Kubernetes / Dockerfile flaws.
- Xygeni CICD Misconfig Scan: pipeline and SCM misconfigurations.
- Xygeni DAST Scan: web-application dynamic findings.
- Xygeni Suspect Dependencies Scan: typosquatting / anomaly / malware signals.
- Xygeni Code Tampering Scan: code-integrity violations.
Common backbone
Every Xygeni report has the same metadata envelope and a stable per-finding
backbone:
| Xygeni field | DefectDojo Finding field | Notes |
|---|---|---|
| metadata.scanType | (dispatch only) | sast / deps / secrets / ... |
| <finding>.uniqueHash | unique_id_from_tool | Vendor-stable; guarantees re-import dedup |
| <finding>.issueId | vuln_id_from_tool | |
| <finding>.severity | severity | Titlecased: critical→Critical, high→High, medium→Medium, low→Low, info→Info |
location.{filepath, beginLine, endLine, code} is shared by SAST and Secrets.
SCA uses package coordinates instead. DAST uses URL+method (out of scope here).
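The shared backbone could be captured by a couple of small helpers in _common.py, sketched below. The helper names (map_severity, common_fields) are hypothetical, and falling back to Info for a missing or unrecognized severity is an assumption rather than documented Xygeni behavior. The helpers return plain dicts of Finding keyword arguments so the example runs on its own.

```python
# Severity translation taken from the backbone mapping above.
SEVERITY_MAP = {
    "critical": "Critical",
    "high": "High",
    "medium": "Medium",
    "low": "Low",
    "info": "Info",
}

def map_severity(value):
    """Translate a Xygeni severity string to a DefectDojo severity.

    Assumption: missing or unknown values fall back to "Info".
    """
    return SEVERITY_MAP.get(str(value or "").strip().lower(), "Info")

def common_fields(item):
    """Build the Finding keyword arguments shared by every scan kind."""
    return {
        "unique_id_from_tool": item.get("uniqueHash"),
        "vuln_id_from_tool": item.get("issueId"),
        "severity": map_severity(item.get("severity")),
    }

backbone = common_fields({
    "uniqueHash": "N0JJTPOJPJBHZw0haLys5Q",
    "issueId": "SAS.injection.python.code_injection_deserialization.main.py.36",
    "severity": "critical",
})
```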
SAST — Xygeni SAST Scan
vulnerabilities[] is the primary array. detector is the rule id.
A subset of findings carry a SARIF-style codeFlows[] block with source/sink
frames and a data path; the parser renders that into the description and
populates DefectDojo's SAST source/sink fields.
Sample finding (taint flow, critical-severity):
{
  "metadata": {"scanType": "sast", "format": "sast-xygeni"},
  "vulnerabilities": [{
    "detector": "python.code_injection_deserialization",
    "kind": "injection",
    "severity": "critical",
    "confidence": "high",
    "language": "python",
    "location": {
      "filepath": "main.py", "beginLine": 36,
      "code": "pickle.loads(decoded_data)"
    },
    "cwe": 502,
    "cwes": ["CWE-502"],
    "tags": ["CWE:502", "OWASP:2021:A8"],
    "explanation": "Untrusted input deserialized via pickle.loads enables RCE.",
    "codeFlows": [{
      "frames": [
        {"kind": "source", "...": "..."},
        {"kind": "sink", "...": "..."}
      ]
    }],
    "uniqueHash": "N0JJTPOJPJBHZw0haLys5Q",
    "issueId": "SAS.injection.python.code_injection_deserialization.main.py.36"
  }]
}
Field mapping:
| DefectDojo Finding | Xygeni source |
|---|---|
| title | detector |
| description | explanation + location.code + rendered codeFlows |
| file_path | location.filepath |
| line | location.beginLine |
| cwe | cwe (numeric) |
| sast_source_file_path / sast_source_line | first codeFlows[].frames[] source, when present |
| sast_sink_object | first codeFlows[].frames[] sink, when present |
| static_finding | True |
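A minimal sketch of the SAST mapping follows, returning a dict of Finding keyword arguments rather than a real Finding. Two things are assumptions: the frame layout is elided in the sample above, so reading a location object (filepath, beginLine, code) from each frame is a guess that mirrors the finding-level location, and the rendering of the full code flow into the description is left out. The function names are hypothetical.

```python
def first_frame(vuln, kind):
    """Return the first codeFlows[].frames[] entry of the given kind, or None."""
    for flow in vuln.get("codeFlows") or []:
        for frame in flow.get("frames") or []:
            if frame.get("kind") == kind:
                return frame
    return None

def sast_finding_fields(vuln):
    loc = vuln.get("location") or {}
    parts = [vuln.get("explanation") or ""]
    if loc.get("code"):
        parts.append("Code: " + loc["code"])
    fields = {
        "title": vuln.get("detector"),
        "description": "\n\n".join(p for p in parts if p),
        "file_path": loc.get("filepath"),
        "line": loc.get("beginLine"),
        "cwe": vuln.get("cwe"),
        "static_finding": True,
    }
    source = first_frame(vuln, "source")
    if source is not None:
        # ASSUMPTION: frames carry a "location" shaped like the finding's own.
        src_loc = source.get("location") or {}
        fields["sast_source_file_path"] = src_loc.get("filepath")
        fields["sast_source_line"] = src_loc.get("beginLine")
    sink = first_frame(vuln, "sink")
    if sink is not None:
        # ASSUMPTION: the sink's code snippet is used as the sink object.
        fields["sast_sink_object"] = (sink.get("location") or {}).get("code")
    return fields

sample_vuln = {
    "detector": "python.code_injection_deserialization",
    "location": {"filepath": "main.py", "beginLine": 36,
                 "code": "pickle.loads(decoded_data)"},
    "cwe": 502,
    "explanation": "Untrusted input deserialized via pickle.loads enables RCE.",
    "codeFlows": [{"frames": [{"kind": "source"}, {"kind": "sink"}]}],
}
sast_fields = sast_finding_fields(sample_vuln)
```

Findings without codeFlows simply omit the three sast_* keys, which matches the "when present" wording in the mapping.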
SCA — Xygeni SCA Scan
Top-level dependencies[], each with nested vulnerabilities[] (CVE/GHSA).
One Finding per dependencies[].vulnerabilities[] entry.
Sample finding:
{
  "metadata": {"scanType": "deps", "format": "deps-xygeni"},
  "dependencies": [{
    "name": "cookie", "version": "0.5.0", "ecosystem": "npm",
    "vulnerabilities": [{
      "id": "CVE-2024-47764",
      "cve": "CVE-2024-47764",
      "severity": "low",
      "fixedVersion": "0.7.0",
      "aliases": ["GHSA-pxg6-pf52-xh8x"],
      "overallCvssScore": -1.0,
      "references": [
        "https://github.com/jshttp/cookie/security/advisories/GHSA-pxg6-pf52-xh8x"
      ],
      "uniqueHash": "CVE-2024-47764#:cookie:0.5.0:javascript",
      "issueId": "SCA.CVE-2024-47764",
      "description": "..."
    }]
  }]
}
Field mapping:
| DefectDojo Finding | Xygeni source |
|---|---|
| title | cve (fall back to id) |
| description | description |
| cve | cve |
| cwe | cwes[0] if present |
| cvssv3_score | overallCvssScore when ≥ 0 |
| mitigation | "Upgrade to {fixedVersion}" |
| references | references joined |
| component_name | parent dependencies[].name |
| component_version | parent dependencies[].version |
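The SCA mapping, one Finding per dependencies[].vulnerabilities[] entry, could be sketched as below. As before, the output is a plain dict of Finding keyword arguments and the function names are hypothetical. The SCA sample does not show a cwes value, so the sketch assumes it uses the same "CWE-502" string form as the SAST sample; joining references with newlines is also an assumption.

```python
import re

def parse_cwe(value):
    """Extract the numeric CWE id from a value such as "CWE-502" (assumed format)."""
    match = re.search(r"\d+", str(value))
    return int(match.group()) if match else None

def sca_finding_fields(dep, vuln):
    fields = {
        "title": vuln.get("cve") or vuln.get("id"),
        "description": vuln.get("description") or "",
        "cve": vuln.get("cve"),
        "references": "\n".join(vuln.get("references") or []),
        "component_name": dep.get("name"),
        "component_version": dep.get("version"),
    }
    cwes = vuln.get("cwes") or []
    if cwes and parse_cwe(cwes[0]) is not None:
        fields["cwe"] = parse_cwe(cwes[0])
    score = vuln.get("overallCvssScore")
    # A negative score (the sample uses -1.0) is treated as "no score available".
    if isinstance(score, (int, float)) and score >= 0:
        fields["cvssv3_score"] = score
    if vuln.get("fixedVersion"):
        fields["mitigation"] = f"Upgrade to {vuln['fixedVersion']}"
    return fields

def iter_sca_findings(report):
    """Yield one field dict per vulnerability of each dependency."""
    for dep in report.get("dependencies") or []:
        for vuln in dep.get("vulnerabilities") or []:
            yield sca_finding_fields(dep, vuln)

sca_report = {
    "dependencies": [{
        "name": "cookie", "version": "0.5.0",
        "vulnerabilities": [{
            "id": "CVE-2024-47764", "cve": "CVE-2024-47764",
            "fixedVersion": "0.7.0", "overallCvssScore": -1.0,
            "references": ["https://github.com/jshttp/cookie/security/advisories/GHSA-pxg6-pf52-xh8x"],
        }],
    }]
}
sca_fields = list(iter_sca_findings(sca_report))
```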
Secrets — Xygeni Secrets Scan
secrets[] is the primary array. The Xygeni report already redacts the secret
value in both secret and location.code — the raw value never appears, so
the parser surfaces those fields as-is.
Sample finding:
{
  "metadata": {"scanType": "secrets", "format": "secrets-xygeni"},
  "secrets": [{
    "secret": "AKIA****REDACTED****",
    "hash": "9d5e...",
    "type": "aws_access_key",
    "detector": "aws-access-key",
    "severity": "high",
    "confidence": "high",
    "location": {
      "filepath": "aws.properties", "beginLine": 7,
      "code": "aws.access.key=AKIA****"
    },
    "description": "AWS access key ID detected.",
    "tags": ["secret:aws", "cwe:798"],
    "uniqueHash": "abc...",
    "issueId": "SECRETS.aws-access-key.aws.properties:7"
  }]
}
Field mapping:
| DefectDojo Finding | Xygeni source |
|---|---|
| title | "{type} secret detected in {filename}" |
| description | description + location.code |
| file_path | location.filepath |
| line | location.beginLine |
| cwe | first cwe:N tag, else 798 |
| mitigation | "Rotate this {type} secret and remove from history." |
| static_finding | True |
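A sketch of the Secrets mapping, again as a plain dict of Finding keyword arguments with hypothetical function names. It assumes {filename} means the last path component of location.filepath, that cwe:N tags are matched case-insensitively, and that a missing type is rendered as "unknown". The redacted values are passed through untouched, as described above.

```python
import re

CWE_TAG = re.compile(r"^cwe:(\d+)$", re.IGNORECASE)

def cwe_from_tags(tags, default=798):
    """Return the number from the first cwe:N tag, or the default (CWE-798)."""
    for tag in tags or []:
        match = CWE_TAG.match(str(tag))
        if match:
            return int(match.group(1))
    return default

def secret_finding_fields(secret):
    loc = secret.get("location") or {}
    filepath = loc.get("filepath") or ""
    # ASSUMPTION: {filename} is the last component of the reported path.
    filename = filepath.replace("\\", "/").rsplit("/", 1)[-1]
    kind = secret.get("type") or "unknown"
    parts = [secret.get("description") or ""]
    if loc.get("code"):
        parts.append("Code: " + loc["code"])  # already redacted by Xygeni
    return {
        "title": f"{kind} secret detected in {filename}",
        "description": "\n\n".join(p for p in parts if p),
        "file_path": filepath or None,
        "line": loc.get("beginLine"),
        "cwe": cwe_from_tags(secret.get("tags")),
        "mitigation": f"Rotate this {kind} secret and remove from history.",
        "static_finding": True,
    }

secret_fields = secret_finding_fields({
    "type": "aws_access_key",
    "location": {"filepath": "aws.properties", "beginLine": 7,
                 "code": "aws.access.key=AKIA****"},
    "description": "AWS access key ID detected.",
    "tags": ["secret:aws", "cwe:798"],
})
```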
Layout
dojo/tools/xygeni/
├── __init__.py
├── parser.py # XygeniParser, dispatches on metadata.scanType
├── sast.py
├── sca.py
├── secrets.py
└── _common.py # severity map, dedup helpers
unittests/scans/xygeni/{sast,sca,secrets}_*.json
unittests/tools/test_xygeni_parser.py
docs/content/en/connecting_your_tools/parsers/file/xygeni.md
PRs originate from xygeni/django-DefectDojo (public org fork) against dev.
Questions
- Is one parser dispatching on metadata.scanType preferred, given the
rusty_hog / anchore_grype precedent? Or should we split into
xygeni_sast / xygeni_sca / xygeni_secrets?
- Any objection to setting vuln_id_from_tool = issueId alongside
unique_id_from_tool = uniqueHash?
- OK to approve this structure now, with the phase-2 scan types
(IaC / CICD / DAST / suspect-deps / code-tampering) added to the same
parser in follow-up PRs?
References:
DefectDojo parser contributor guide ·
Xygeni docs