Skip to content

Add python bindings to the global thread pool functionality #24238

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

khoover
Copy link

@khoover khoover commented Mar 28, 2025

Description

Allows users to configure and enable the global thread pool via Python, and have inference sessions use it instead of session-local thread pools.

Motivation and Context

Forked off of #23495 to take over implementation, see issue #23523.

Our particular use case involves a single service instance serving thousands of individual models, each relatively small (e.g. small decision trees). Creating individual services for each model is too much overhead, and attempting to start several thousand thread-pools is a non-starter. We could possibly have each session be single-threaded, but we would like to be able to separate the request handler thread count from the compute thread count (e.g. 2 handler threads but 4 intra-op ones).

@snnn
Copy link
Member

snnn commented Mar 28, 2025

We don't have a good way to shutdown the thread pool. It should be done before the python process started to exit and unload DLLs. So this doc: https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-best-practices#deadlocks-caused-by-lock-order-inversion
So, generally speaking, if you have a thread pool and you declared it as a global var, then you must manually shutdown it, otherwise it will cause a deadlock. The issue was highly reproducible. However, recently probably the restriction is gone? I am not sure. But the Windows doc is not updated. So I am not very sure if this will work.

@snnn
Copy link
Member

snnn commented Mar 28, 2025

Like this: apache/mxnet#11163

@khoover
Copy link
Author

khoover commented Apr 17, 2025

@microsoft-github-policy-service agree company="Instacart"

@khoover khoover force-pushed the add-python-global-thread-pool branch from 6a2b524 to 44d5cf6 Compare April 17, 2025 23:27
@khoover
Copy link
Author

khoover commented Apr 18, 2025

We don't have a good way to shutdown the thread pool. It should be done before the python process started to exit and unload DLLs. So this doc: https://learn.microsoft.com/en-us/windows/win32/dlls/dynamic-link-library-best-practices#deadlocks-caused-by-lock-order-inversion So, generally speaking, if you have a thread pool and you declared it as a global var, then you must manually shutdown it, otherwise it will cause a deadlock. The issue was highly reproducible. However, recently probably the restriction is gone? I am not sure. But the Windows doc is not updated. So I am not very sure if this will work.

Would adding onnxruntime::Environment::ShutdownGlobalThreadPools and then hooking it into atexit in Python work? I imagine the existing C++ API has examples of needing to do this as well, no?

EDIT: I see, the native implementation would just let Environment destruct at the end of the program, and that handles shutting down. But the Python bindings have a static shared_ptr<Environment>, and it's really just the thread pools that need to go.

@khoover khoover force-pushed the add-python-global-thread-pool branch from 44d5cf6 to 510733f Compare April 18, 2025 16:47
@khoover khoover marked this pull request as ready for review April 18, 2025 22:36
…tting the global threading options
@snnn
Copy link
Member

snnn commented Apr 21, 2025

On Windows when a thread shuts down, the OS also needs to notify all loaded DLLs in case if they need to do any cleanup work. It cannot be too late. So, in general all user created threads should be shutdown before unloading DLLs. Theoretically speaking, there is another way of doing this: the cleanup functions should test if the currently the process is shutting down, if yes, giving up doing any clean up. We do not have such logics in ORT yet. We may consider doing it later.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants