Unhandled XML ParseError in import-scan crashes uwsgi worker (api_v2/serializers.py:2472) #14752

@bjornmage

Description

When a malformed XML payload is uploaded to /api/v2/import-scan/ (or /api/v2/reimport-scan/), an xml.etree.ElementTree.ParseError: not well-formed (invalid token) propagates uncaught out of the parser path invoked from dojo/api_v2/serializers.py:2472. The uwsgi worker handling the request dies. Repeated POSTs of the same malformed payload cause the defectdojo-django Pod to accumulate restarts — we observed 41 restarts in a single CI integration session before we added a client-side preflight.

A parse error in user-supplied input should be returned to the client as 400 Bad Request with a clear error message, not crash the worker process.

Environment

  • DefectDojo: v2.55.4 (Helm chart defectdojo/defectdojo 1.9.14)
  • Deployment: Kubernetes, uwsgi backend, PostgreSQL
  • Path: POST /api/v2/import-scan/ with scan_type whose parser uses xml.etree.ElementTree

Reproduction

  1. Authenticate to the API and obtain a token.
  2. Submit a malformed XML file in the file form field to /api/v2/import-scan/. Minimal repro: a payload that opens a tag without closing it, e.g. <root (no >, no closing tag), or content truncated mid-element.
    curl -X POST -H "Authorization: Token <TOKEN>" \
      -F "scan_type=Trivy Scan" \
      -F "engagement=<id>" \
      -F "file=@malformed.xml" \
      https://defectdojo.example/api/v2/import-scan/
  3. Observe in the django logs:
    xml.etree.ElementTree.ParseError: not well-formed (invalid token): line N, column M
      File "/app/dojo/api_v2/serializers.py", line 2472, in <method>
  4. The uwsgi worker for that request terminates. The container's restart counter increments after the worker pool exhausts retries. Repeated POSTs of the same payload accumulate restarts.
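
To make step 2 reproducible without real scanner output, here is a minimal sketch that writes two malformed payloads of the kinds described above and confirms that the standard-library parser rejects both. The file names and payload bytes are illustrative assumptions, not taken from an actual scan report, and the exact ParseError text depends on how the payload is broken.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Illustrative payloads (assumed, not from a real report):
# one unclosed opening tag, one document truncated mid-element.
PAYLOADS = {
    "unclosed_tag.xml": b"<root",
    "truncated.xml": b"<root><item>trunc",
}


def write_and_check(directory: Path) -> dict[str, str]:
    """Write each payload to disk and return the ParseError text it triggers."""
    errors = {}
    for name, data in PAYLOADS.items():
        path = directory / name
        path.write_bytes(data)
        try:
            ET.parse(path)
        except ET.ParseError as exc:
            # Only payloads that fail to parse end up in the result.
            errors[name] = str(exc)
    return errors


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, message in write_and_check(Path(tmp)).items():
            print(f"{name}: {message}")
```

Either generated file can then be used as malformed.xml in the curl command from step 2.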

Expected behavior

The XML parsing call site should wrap ET.parse / ET.fromstring in try/except xml.etree.ElementTree.ParseError and translate the failure into a serializers.ValidationError (or equivalent), returning 400 Bad Request to the client with the parse error message. The worker process should not die on user input.

Workaround (defensive, client-side)

We added an xmllint --noout preflight step in our scan importer pipeline so malformed XML is rejected before the POST reaches DefectDojo. Reference (private repo): https://git.developerdojo.org/DroidOpsInc/launch-sequence/-/merge_requests/128

This protects our deployment but does not fix the underlying server-side issue — any other client (or hostile input) can still kill workers.
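
For clients without xmllint, roughly the same well-formedness preflight can be sketched in Python with the standard library. This is not the check from our pipeline (that one uses xmllint --noout); the function names check_well_formed and main are hypothetical, and the sketch tests well-formedness only, not schema validity.

```python
import sys
import xml.etree.ElementTree as ET


def check_well_formed(path: str) -> tuple[bool, str]:
    """Return (True, "") if the file parses as XML, else (False, parser message)."""
    try:
        ET.parse(path)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, ""


def main(paths: list[str]) -> int:
    """Check every path; return 1 if any file is malformed, else 0."""
    failed = False
    for path in paths:
        ok, message = check_well_formed(path)
        if not ok:
            # Report on stderr so CI logs show which upload was skipped and why.
            print(f"{path}: {message}", file=sys.stderr)
            failed = True
    return 1 if failed else 0
```

Called from a script entry point as raise SystemExit(main(sys.argv[1:])), this exits non-zero when any file fails to parse, so a CI step can stop before the POST is made.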

Suggested fix sketch

At the parser invocation around dojo/api_v2/serializers.py:2472 (and any sibling call sites that consume user-uploaded XML), wrap the parse call:

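import xml.etree.ElementTree as ET
from rest_framework import serializers

try:
    # scan_file: the uploaded file object taken from the request
    tree = ET.parse(scan_file)
except ET.ParseError as e:
    # Report the parse failure as a 400 instead of letting it propagate
    raise serializers.ValidationError({"file": f"Malformed XML: {e}"}) from e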
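
If several call sites need the same translation, one option is a small context manager. The sketch below is an assumption about how that could look, not existing DefectDojo code: xml_parse_errors_as and parse_upload are hypothetical names, and UploadValidationError is a stand-in so the example runs without Django REST framework installed. In DefectDojo the exception type passed in would presumably be serializers.ValidationError.

```python
import xml.etree.ElementTree as ET
from contextlib import contextmanager


class UploadValidationError(Exception):
    """Stand-in for rest_framework.serializers.ValidationError (hypothetical)."""


@contextmanager
def xml_parse_errors_as(exc_type, field="file"):
    """Re-raise ET.ParseError from the wrapped block as exc_type({field: message})."""
    try:
        yield
    except ET.ParseError as exc:
        # Chain the original error so the parser's line/column stay in the traceback.
        raise exc_type({field: f"Malformed XML: {exc}"}) from exc


def parse_upload(data: bytes) -> ET.Element:
    """Example call site: parse uploaded bytes, translating parse failures."""
    with xml_parse_errors_as(UploadValidationError):
        return ET.fromstring(data)
```

Passing the exception type in keeps the helper independent of the final choice between DRF ValidationError and a more specific subclass.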

Happy to open a PR if a maintainer can confirm the desired exception surface (DRF ValidationError vs a more specific ParseError subclass).
