libobs: Fix GPU encode thread race conditions#13285
Conversation
|
Please include a crash report, and if possible, split the commits based on each fix. |
|
Unhandled exception: c0000005 Thread 257C: obs gpu encode thread (Crashed) Thread 8280: Thread 7B20: Thread 330C: Thread 2510: libobs: hotkey thread Thread 768C: tiny_tubular_task_thread Thread 811C: audio-io: audio thread Thread 18B8: Thread 2290: Thread 75D0: Thread 7F20: Thread 7F98: Thread 7BD8: Thread 80A8: Thread 7D30: Thread 72A8: Thread 7E9C: Thread 810C: Thread 62F0: Thread 7CB0: Thread 63F0: Thread 7EE8: Thread 6A78: Thread 78B8: Thread 2DA0: Thread 42CC: Thread 75E0: Thread 1770: Thread B44: Thread 3320: Thread 2AE4: Thread 61A8: Thread 6308: Thread 8384: Thread 6B08: Thread 2090: video-io: video thread Thread 7A40: libobs: graphics thread Thread 2998: Thread 4554: Thread 2FA4: Thread 754C: Thread 7DC4: scripting: defer Thread 6A9C: Thread 4E78: Thread 3E94: Thread 7B54: Thread EE4: Thread 42D4: Thread 3D88: win-dshow: DShowThread Thread 1560: Thread 679C: win-wasapi: reconnect thread Thread 7A14: win-wasapi: reconnect thread Thread 7094: Thread 7488: Thread DDC: Thread A3C: Thread 12D0: Thread 1E14: Thread 8AC: Thread 2440: Thread 7598: Thread 6320: Thread 7208: Thread 324C: Thread 5EB0: win-dshow: DShowThread Thread 514C: win-dshow: DShowThread Thread 704C: Thread 53CC: win-dshow: DShowThread Thread 712C: CrBrowserMain Thread 7294: ThreadPoolServiceThread Thread 2C70: ThreadPoolForegroundWorker Thread 1EE8: ThreadPoolBackgroundWorker Thread 60B8: ThreadPoolForegroundWorker Thread 7758: Chrome_IOThread Thread 1528: ThreadPoolForegroundWorker Thread 7764: MemoryInfra Thread 65D0: ThreadPoolSingleThreadCOMSTASharedForeground0 Thread 7B10: ThreadPoolSingleThreadCOMSTASharedForegroundBlocking1 Thread 693C: Thread 34C0: ThreadPoolSingleThreadForegroundBlocking2 Thread 716C: CacheThread_BlockFile Thread 7DDC: CompositorTileWorker1 Thread 8C8: VideoCaptureThread Thread 7BD4: ThreadPoolForegroundWorker Thread 6440: ThreadPoolForegroundWorker Thread 7698: ThreadPoolSingleThreadSharedBackgroundBlocking3 Thread 6AB8: ThreadPoolSingleThreadSharedForegroundBlocking4 Thread 81A0: ThreadPoolForegroundWorker Thread 7F84: ThreadPoolForegroundWorker Thread 2B8C: ThreadPoolSingleThreadSharedForeground5 Thread 76C8: ThreadPoolForegroundWorker Thread 46DC: ThreadPoolForegroundWorker Thread 22B4: Thread 5D4: ThreadPoolBackgroundWorker Thread 2738: win-dshow: DShowThread Thread 83B4: Thread 66B0: mp_media_thread Thread 78BC: Thread 36DC: Thread B78: Thread 4B44: Thread 66C8: Thread 536C: Thread 687C: Thread 7198: Thread 6D18: Thread 8118: Thread 76F8: Thread 6954: Thread 149C: Thread 71FC: Thread 6A58: Thread 6CD4: Thread 8138: Thread 5E50: Thread 6EC0: Thread 7E94: Thread 3AB8: Thread 6E2C: mp_media_thread Thread 74BC: Thread 7390: Thread 7F6C: Thread 6F4C: Thread 77D8: Thread 6284: Thread 781C: Thread 3EE4: Thread 80A4: Thread 7B24: Thread 6854: Thread 794: Thread 6088: Thread 74A0: Thread 3E70: Thread 72E0: Thread 5AC: Thread 7438: Thread 4B3C: Thread 7C40: Thread 65AC: QThread Thread 6E18: win-wasapi: reconnect thread Thread 83A8: win-wasapi: reconnect thread Thread 5C68: Thread 5E14: mp_cache_thread Thread 6950: Thread 758C: Thread 3BEC: Thread 12D4: Thread 2468: Thread 1D0: Thread 7270: Thread 6CF8: Thread 5D94: Thread 2C7C: Thread 53D0: Thread 1360: Thread 66B8: Thread F14: Thread 6430: Thread 79D8: Thread 326C: Thread 2710: Thread 20D0: Thread 77E4: Thread 73D4: Thread 3774: Thread 32E8: Thread 7A74: Thread 728C: Thread 3720: Thread 65C8: Thread 7CC8: mp_cache_thread Thread 6490: ThreadPoolForegroundWorker Thread C30: ThreadPoolForegroundWorker Thread 4F40: ThreadPoolForegroundWorker Thread 727C: Thread 1048: Thread 1FC0: video-io: video thread Thread 2BBC: video-io: video thread Thread 2A60: Thread 6024: mp_cache_thread Thread 245C: mp_cache_thread Thread 1880: ThreadPoolSingleThreadCOMSTASharedForegroundBlocking6 Thread 5928: DManip Delegate Thread Thread 82E0: |
|
As for the split commits, both bugs were discovered as a result of the crash investigation and are fixes to the same module, so I'm a bit uncomfortable with claiming each fix is testable independent of the other. if the review process requires it, I'll be happy to do what I can. Thank you for the speedy feedback. I hope this patch will help others who may have experienced this situation, and prevent future users of having to deal with it. |
|
Instead of a new noref function I would expect 2 releases of a weak reference in the stop gpu encode function |
Fix two race conditions in the GPU encode thread that could cause access violations or deadlocks when stopping stream output. Bug 1 - paired_encoders dangling pointer: gpu_encode_thread() cached raw pointers from encoder->paired_encoders without holding any lock. Concurrently, obs_encoder_shutdown() on another thread could call da_free() on the same backing buffer, leaving the GPU thread with a dangling pointer. A blocking lock was not viable because it deadlocks with obs_encoder_stop(), which holds init_mutex while waiting on gpu_encode_inactive. Fix: Use pthread_mutex_trylock on encoder->init_mutex. If unavailable the encoder is mid-shutdown; skip it for the cycle so gpu_encode_inactive can be signalled. If the lock is acquired, snapshot paired_encoders via da_copy and addref each weak ref under the lock, then iterate outside it. Bug 2 - freed encoder struct in gpu_encoders: gpu_encoders stored raw obs_encoder_t* pointers. If the last external strong ref was dropped before stop_gpu_encode() erased the entry, obs_encoder_destroy() could bfree() the encoder struct while the GPU thread read from it to reach context.control. Fix: Change gpu_encoders to store obs_weak_encoder_t* (control struct) pointers. Take a weak-ref addref in start_gpu_encode and release both it and a search ref in stop_gpu_encode after os_event_wait. The GPU thread calls obs_weak_encoder_get_encoder() directly, reading only from the control struct, never from the encoder struct.
8083845 to
33d71fa
Compare
|
That does sound like the best approach. Thanks. |
|
Instead of merge commits, we generally want PRs to be rebased as to not litter master with a bunch of merge commits. |
derrod
left a comment
There was a problem hiding this comment.
This seems reasonable, I haven't thouroughly tested this, but it has my approval, styling nitpicks aside.
|
Apologies for the repeated merges. I appreciate the insight into the process. |
Description
I did not find documentation describing the GPU encode thread usage, so this description currently serves as the only information I've been able to contribute to documenting the encoder threads.
Fix two race conditions in the GPU encode thread that could cause access violations or deadlocks when stopping stream output.
Bug 1 - paired_encoders dangling pointer: gpu_encode_thread() cached raw pointers from encoder->paired_encoders without holding any lock. Concurrently, obs_encoder_shutdown() on another thread could call da_free() on the same backing buffer, leaving the GPU thread with a dangling pointer. A blocking lock was not viable because it deadlocks with obs_encoder_stop(), which holds init_mutex while waiting on gpu_encode_inactive.
Fix: Use pthread_mutex_trylock on encoder->init_mutex. If unavailable the encoder is mid-shutdown; skip it for the cycle so gpu_encode_inactive can be signalled. If the lock is acquired, snapshot paired_encoders via da_copy and addref each weak ref under the lock, then iterate outside it.
Bug 2 - freed encoder struct in gpu_encoders: gpu_encoders stored raw obs_encoder_t* pointers. If the last external strong ref was dropped before stop_gpu_encode() erased the entry, obs_encoder_destroy() could bfree() the encoder struct while the GPU thread read from it to reach context.control.
Fix: Change gpu_encoders to store obs_weak_encoder_t* (control struct) pointers. Take a weak-ref addref in start_gpu_encode and release it in stop_gpu_encode after os_event_wait(gpu_encode_inactive). The GPU thread calls obs_weak_encoder_get_encoder() directly, reading only from the control struct, never from the encoder struct.
Adds obs_encoder_get_weak_encoder_noref() helper to retrieve the control pointer without incrementing the weak refcount.
Motivation and Context
While working with multi-streaming in OBS and 3rd party plugins, I encountered a situation where closing several streams at nearly the same time caused an OBS crash. When investigating the 3rd-party code (Aitum Stream Suite, in particular) the location identified from crash dumps pointed to the correct fix being in OBS Studio. The crash was easily reproducible by starting streams to 3 different platforms, letting them run a minute or two at least, then stopping the streams via OBS. The typical result was an OBS crash prompting me if I'd like a crash dump copied to the pasteboard. It is the subsequent investigations that prompted me to produce this patch.
How Has This Been Tested?
My main stream is to Twitch (the default in OBS) and I've setup 3 other streaming platforms via the 3rd-party plugin. After building with this patch, I tried several times to reproduce the crash to no avail. Next, I ran a full day of streaming, stopping the 3 other streams mid-way through my main stream, then restarting them later, then stopping them again at the end of the streaming day. No crashes.
Types of changes
Bug fix (non-breaking change which fixes an issue)
Checklist: