Skip to content

Lock all ScriptCompilationSystem mutators and snapshot readers#150

Open
JustinMDotNet wants to merge 1 commit into
wopss:masterfrom
JustinMDotNet:fix/script-compilation-system-locking
Open

Lock all ScriptCompilationSystem mutators and snapshot readers#150
JustinMDotNet wants to merge 1 commit into
wopss:masterfrom
JustinMDotNet:fix/script-compilation-system-locking

Conversation

@JustinMDotNet

Copy link
Copy Markdown

Closes #140.

What this changes

Three pieces:

  1. Every public mutator and accessor on ScriptCompilationSystem now takes m_mutex. Previously only the private Add(plugin, path) overload was locked; SetScriptsBlob, SetModdedScriptsBlob, RegisterNeverRefType, RegisterMixedRefType, and every HasXxx / GetXxx ran with no synchronization.
  2. The list / map getters now return by value instead of const T&. GetScriptPaths, GetNeverRefTypes, GetMixedRefTypes, GetScriptsBlob, GetModdedScriptsBlob take the lock, snapshot the container, and return the copy. This makes the returned object safe to use after the lock is released.
  3. GetCompilationArgs takes the lock for its full iteration of m_scriptPaths. Previously it iterated unlocked while plugin workers could be inserting via Add.

m_mutex is now mutable so the const accessors can lock it.

Why

v1 plugins can call these from any thread, but the script-compile hook runs on the game thread. Concrete races the previous code allowed:

  • A plugin worker calls RegisterNeverRefType while the compile hook iterates m_neverRefTypes in ExecuteProcess.cpp:117. vector::emplace_back can reallocate mid-iteration.
  • The compile hook iterates m_scriptPaths in GetCompilationArgs while another worker is inside the locked Add(plugin, path) — that worker's mutation could trigger unordered_multimap rehash mid-iteration.
  • Two plugins racing on RegisterMixedRefType corrupt the vector because the inner write was never locked.

Call site impact

Two callers were touched:

  • ExecuteProcess.cpp:96, 102, 112, 117, 122 — keeps compiling because the range-for over an rvalue Map_t / vector extends the temporary's lifetime for the duration of the loop, and the ternary in :96 now produces an rvalue path that decays into the existing const auto blobPath binding identically.
  • LoadScripts.cpp:22-24 — unchanged in shape; .string() is called on the rvalue path instead of an lvalue reference and produces the same std::string either way.

Design choices

  • Snapshot-on-read instead of returning iterator pairs under an external lock. The copies are not in any hot path (the compile hook runs once per scc invocation, not per frame), so the simplicity wins. If profiling ever flags this, the alternative is to expose a WithScriptPaths(callback) style scoped accessor.
  • Did not introduce a combined HasScriptsBlob() + GetScriptsBlob() accessor. There is a remaining TOCTOU between those two calls — a plugin worker setting the blob between the test and the get would still produce a "consistent but stale" read. The fix for that is API shape, not locking. Worth a follow-up issue if you want me to file one.
  • Kept the existing public/private signature split. The public mutators that take std::string by value now lock around the move-into-vector; the move is cheap so the critical section stays small.

What I tested

Local rebuild and a read-through of the diff against both call sites. No runtime tests — the race is hard to repro on demand. The new lock placement matches the existing Add(plugin, path) overload's pattern exactly.

Previously only the private Add(plugin, path) overload took
m_mutex. Every other mutator (SetScriptsBlob, SetModdedScriptsBlob,
RegisterNeverRefType, RegisterMixedRefType) and every reader
(HasXxx, GetXxx, GetCompilationArgs, GetScriptPaths) ran with no
synchronization. The v1 ABI lets plugin worker threads call the
mutators while the script-compile hook on the game thread iterates
the same containers, which can rehash an unordered_multimap or
realloc a vector mid-read.

Take m_mutex in every public mutator and every public accessor.
Make m_mutex mutable so the const accessors can lock it. Switch
the list/map getters from returning const T& to returning T by
value so callers can safely use the result after the lock is
released. Take the lock for the full iteration in
GetCompilationArgs as well.

The two consumers in Hooks/ExecuteProcess.cpp and
Hooks/LoadScripts.cpp still compile unchanged: range-for over an
rvalue container extends the temporary's lifetime, and the
ternaries deduce the same common type as before.

Fixes wopss#140

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ScriptCompilationSystem has inconsistent locking around its shared state

2 participants