fix: add thread safety to preload_windows_gl_dlls#421
Conversation
preload_windows_gl_dlls() modifies three module-level mutable collections (_WINDOWS_DLL_HANDLES, _REGISTERED_DLL_DIRS, _PRELOADED_DLL_PATHS) without holding _CONTEXT_LOCK. If multiple threads call envpool.make() concurrently, these shared data structures can be corrupted. Move the shared-state mutation section under _CONTEXT_LOCK to prevent concurrent modification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 85a24a2101
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if dll_path.is_file() and dll_path_str not in _PRELOADED_DLL_PATHS: | ||
| _WINDOWS_DLL_HANDLES.append(win_dll(str(dll_path))) | ||
| _PRELOADED_DLL_PATHS.add(dll_path_str) | ||
| with _CONTEXT_LOCK: |
There was a problem hiding this comment.
Avoid reacquiring the non-reentrant context lock
On Windows when ENVPOOL_DLL_DIR is set to a valid DLL directory, try_ensure_mujoco_glfw_context() already holds _CONTEXT_LOCK while constructing _GlfwContext, and _GlfwContext.__init__() calls preload_windows_gl_dlls(). This new with _CONTEXT_LOCK: therefore tries to acquire the same threading.Lock a second time in the same thread, causing the first GLFW context initialization to hang instead of rendering or falling back. Use an RLock or move the preload outside the outer locked region.
Useful? React with 👍 / 👎.
Trinkle23897
left a comment
There was a problem hiding this comment.
what kind of issue do you want to address? if it's a real issue, could you provide your minimal repo?
|
You're right to question this — I think this PR should be closed entirely. The original code is not buggy. Here's why: The call chain is:
This PR introduces a real P1 deadlock. I'd recommend closing this PR as-is. If thread safety for standalone 🤖 Generated with Claude Code |
|
Closing: the original code has no data race (preload_windows_gl_dlls is always called under _CONTEXT_LOCK via try_ensure_mujoco_glfw_context → _GlfwContext.init), and this PR introduces a deadlock with the non-reentrant threading.Lock. |
Summary
Problem
preload_windows_gl_dlls() modifies three module-level mutable collections (_WINDOWS_DLL_HANDLES, _REGISTERED_DLL_DIRS, _PRELOADED_DLL_PATHS) without holding the existing _CONTEXT_LOCK. Concurrent envpool.make() calls that trigger preload_windows_gl_dlls() can corrupt these shared data structures.
Test plan
🤖 Generated with Claude Code