Parse files in parallel when possible#21175
Parse files in parallel when possible#21175ilevkivskyi wants to merge 3 commits intopython:masterfrom
Conversation
This comment has been minimized.
This comment has been minimized.
|
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅ |
| # TODO: we should probably use psutil instead. | ||
| # With psutil we can get a number of physical cores, while all stdlib | ||
| # functions include virtual cores (which is not optimal for performance). | ||
| available_threads = os.cpu_count() or 2 # conservative fallback |
There was a problem hiding this comment.
Yes, len(psutil.Process().cpu_affinity()) is better everywhere except Darwin/macOS, where psutil doesn't support that; though I still suggest taking the minimum of that and os.cpu_count() as the later respects -X cpu_count and/or PYTHON_CPU_COUNT for Python 3.13+ users (especially containerized users).
If you don't want to add a psutil dependency yet, I recommend os.sched_getaffinity(0) which is how os.process_cpu_count() is implemented on Python 3.13+. (you should also still call os.cpu_count() and use it if it is smaller, for the same reasons as above).
There was a problem hiding this comment.
Yeah, I tried sched_getaffinity() but it is not available on Python 3.10 (which we still support). I guess we may need to write a separate helper with various fallbacks logic to make this ~reliable.
There was a problem hiding this comment.
Huh, I see os.sched_getaffinity() all the way back to Python 3.3: https://docs.python.org/3.3/library/os.html#os.sched_getaffinity
However,
They are only available on some Unix platforms
So maybe your platform didn't implement it until a later Python version.
Yeah, helper function + memoization is very helpful here
The idea is simple: new parser doesn't need the GIL, so we can parse files in parallel. Not sure why, but the most I see is ~4-5x speed-up with 8 threads, if I add more threads, it doesn't get visibly faster (I have 16 physical cores).
Some notes on implementation:
ThreadPoolExecutor, it seems to work OK.parse_file()a bit, so that we can parallelize (mostly) just the actual parsing. I see measurable degradation if I try to parallelize all ofparse_file().psutilbecause it is an optional dependency. We may want to actually make it a required dependency at some point.ast_serializeto beNonesometimes in some threads. I simply add an ugly workaround for now.cc @JukkaL