Summary:
Problem
Currently, the entrypoint for in-place Python binaries (i.e. built with dev mode) executes the following steps to load system native dependencies (e.g. sanitizers and allocators):
- Backup LD_PRELOAD set by the caller
- Append system native dependencies to LD_PRELOAD
- Inject a prologue in user code which restores LD_PRELOAD set by the caller
- execv the Python interpreter
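The steps above can be sketched roughly as follows. This is only an illustration, not the actual entrypoint: the backup variable name ORIGINAL_LD_PRELOAD and the function names are hypothetical, and os.execve is used in place of a bare execv so the modified environment can be passed explicitly.

```python
import os

# Hypothetical name for the variable that carries the caller's original value.
BACKUP_VAR = "ORIGINAL_LD_PRELOAD"


def prepare_env(env, system_deps):
    """Steps 1-2: back up the caller's LD_PRELOAD, then append system deps."""
    new_env = dict(env)
    original = env.get("LD_PRELOAD")
    if original is not None:
        new_env[BACKUP_VAR] = original
    parts = original.split(":") if original else []
    new_env["LD_PRELOAD"] = ":".join(parts + list(system_deps))
    return new_env


def restore_env(env):
    """Step 3: what the injected prologue does before user code runs."""
    original = env.pop(BACKUP_VAR, None)
    if original is None:
        env.pop("LD_PRELOAD", None)
    else:
        env["LD_PRELOAD"] = original


def launch(interpreter, argv, system_deps):
    """Step 4: replace the current process with the Python interpreter."""
    env = prepare_env(os.environ, system_deps)
    os.execve(interpreter, [interpreter] + list(argv), env)
```

Once restore_env has run, the environment no longer lists the system native dependencies, so anything the program launches afterwards inherits only the caller's original LD_PRELOAD.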
The steps work as intended for single-process Python programs. However, when a
Python program spawns child processes, the child processes will not load system
native dependencies, since they simply execv the vanilla Python interpreter. A
few examples of why this is problematic:
- The ASAN runtime library is a system native dependency. Without loading it, a
  child process that loads user native dependencies compiled with ASAN will
  crash during static initialization because it can't find __asan_init.
- jemalloc is also a system native dependency.
Many, if not most, ML use cases ban dev mode because of these problems. This is very unfortunate considering the developer efficiency dev mode provides. In addition, a large number of unit tests have to run in a more expensive build mode because of these problems.
For an earlier discussion, see this post.
Solution
Move the logic that loads system native dependencies out of the Python binary
entrypoint into an interpreter wrapper, and set the interpreter wrapper as
sys.executable in the injected prologue:
- The Python binary entrypoint now uses the interpreter wrapper, which has the same command line interface as the Python interpreter, to run the main module.
- multiprocessing's spawn method now uses the interpreter wrapper to create child processes, ensuring system native dependencies get loaded correctly.
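A minimal sketch of the idea, under stated assumptions: REAL_INTERPRETER, SYSTEM_DEPS and all function names below are hypothetical placeholders, and the backup and restore of the caller's LD_PRELOAD is omitted for brevity. The explicit multiprocessing.set_executable call is an extra precaution added here for the case where multiprocessing was already imported before the prologue ran; the change itself only describes setting sys.executable.

```python
import multiprocessing
import os
import sys

# Hypothetical values; a real build would generate these.
REAL_INTERPRETER = "/usr/bin/python3"
SYSTEM_DEPS = ["/path/to/libasan.so", "/path/to/libjemalloc.so"]


def wrapper_command(argv, env, interpreter=REAL_INTERPRETER, deps=SYSTEM_DEPS):
    """Build the (path, argv, env) triple the wrapper would exec.

    The wrapper accepts the same arguments as the interpreter and forwards
    them unchanged; only LD_PRELOAD differs.
    """
    new_env = dict(env)
    caller = env.get("LD_PRELOAD")
    parts = caller.split(":") if caller else []
    new_env["LD_PRELOAD"] = ":".join(parts + list(deps))
    return interpreter, [interpreter] + list(argv), new_env


def wrapper_main():
    """Entry point of the wrapper: exec the real interpreter with deps loaded."""
    path, argv, env = wrapper_command(sys.argv[1:], os.environ)
    os.execve(path, argv, env)


def prologue(wrapper_path):
    """Injected prologue: make child-process creation go through the wrapper."""
    sys.executable = wrapper_path
    multiprocessing.set_executable(wrapper_path)
```

Because every spawned child is started through the wrapper, it goes through the same LD_PRELOAD setup as the parent did, rather than starting from the restored environment.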
Alternative Considered
One alternative considered is to simply not remove system native dependencies
from LD_PRELOAD, so they are present in the spawned processes. However, this
causes some linking issues, which were perhaps the reason LD_PRELOAD was
restored in the first place: in-place Python binaries have access to binaries
installed on devservers that are not built with the target platform (e.g.
/bin/sh, which is used by some Python standard libraries). These binaries do
not link properly with the system native dependencies.
References
- An old RFC for this change: D16210828
- The counterpart for opt mode: D16350169