Skip to content

epub: guard against a NULL head node in add_mathjax_script_node_to_file#719

Open
maxheise wants to merge 1 commit into
linuxmint:masterfrom
maxheise:epub-mathjax-head-null-deref
Open

epub: guard against a NULL head node in add_mathjax_script_node_to_file#719
maxheise wants to merge 1 commit into
linuxmint:masterfrom
maxheise:epub-mathjax-head-null-deref

Conversation

@maxheise
Copy link
Copy Markdown

@maxheise maxheise commented Jun 6, 2026

Hello,

When an EPUB content document is flagged for MathML (its OPF manifest
item carries properties="mathml"), the backend injects a MathJax script
element into that document's head so the math renders. That injection is
add_mathjax_script_node_to_file() in backend/epub/epub-document.c, which
finds the head by walking the children of the document root:

xmlNodePtr head = mathroot->children;
while(head != NULL) {
    if (!xmlStrcmp(head->name,(xmlChar*)"head")) {
        break;
    }
    head = head->next;
}
if (xmlStrcmp(head->name,(xmlChar*)"head")) {
    return ;
}

If a head child is found, the loop breaks with head valid. If none is
found, or the root has no children, the loop exits with head == NULL,
and the next line reads head->name on that NULL pointer and crashes.

A normal XHTML page has a head, so this does not happen for well-formed
EPUBs. It is reachable on malformed or unusual input: the path of the
file injected into comes straight from the manifest href, and nothing
checks its structure before parsing. A content document whose root has
no head child (an html with only a body, a non-html root, or an empty
root) reaches the dereference. Since EPUB files are untrusted input,
this is a robustness fix rather than a fix for a crash on valid
documents.

The change adds a head == NULL guard before the dereference; the ||
short-circuits, so a NULL head returns cleanly and a non-NULL head is
unchanged.

I did consider going further and creating a head element when one is
missing, so the MathJax script could still be inserted and the math
would render. I decided against it. A normal EPUB already has a head, so
that case is rare, and doing it properly is noticeably more work (it is
only valid when the root is html, and the new head has to be placed as
the first child, before body). For almost no practical gain it would add
behaviour and complexity to a path that should simply decline to touch a
malformed document, which is what the function already does when the
root element is missing. So this change only prevents the crash.

Thanks for taking a look.

When an EPUB content document is flagged for MathML (its OPF manifest
item carries properties="mathml"), the backend injects a MathJax script
element into that document's head so the math renders. That injection is
add_mathjax_script_node_to_file() in backend/epub/epub-document.c. To
find the head it walks the children of the document root:

    xmlNodePtr head = mathroot->children;
    while(head != NULL) {
        if (!xmlStrcmp(head->name,(xmlChar*)"head")) {
            break;
        }
        head = head->next;
    }
    if (xmlStrcmp(head->name,(xmlChar*)"head")) {
        return ;
    }

When a head child is found the loop exits via break with head non-NULL.
When no child is named head, or the root has no children, the loop
instead exits because head has become NULL, and the following
xmlStrcmp(head->name, ...) then dereferences NULL and crashes.

A well-formed XHTML content document always has a head element, so this
does not occur for normal EPUBs. It is reachable on malformed or unusual
input: the path of the file injected into is taken verbatim from the
manifest href, and nothing validates its structure before this function
parses it. A content document with a root that has no head child (for
example an html with only a body, a non-html root, or an empty root)
reaches the dereference. EPUB files are untrusted input, so this is a
robustness fix, not a fix for a crash on valid documents.

Guard the check with head == NULL before the dereference. The || short
circuits, so a NULL head returns without touching head->name; a non-NULL
head behaves exactly as before.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant