gh-116738: Use PyMutex for lzma module #140711

yoney · 2025-10-28T16:09:04Z

Similar to #140555, the main goal was to review the lzma module for free-threading. The methods already use a lock, which makes them thread-safe in a free-threaded build. I replaced PyThread_acquire_lock with PyMutex. PyMutex releases the GIL when the thread is parked. This change removes some macros and allocation handling code.
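As a rough illustration of what "the methods already use a lock" means from the Python side, the sketch below drains one shared `LZMADecompressor` from several threads. The payload, the sizes, and the helper names (`decompress_shared`, `expected_chunks`) are assumptions made for this example and are not taken from the PR's test file. Because the order in which threads append their results is not fixed, the chunks are compared as a sorted collection rather than by position.

```python
import lzma
import random
import threading

CHUNK_SIZE = 4096
NUM_CHUNKS = 8

# Deterministic, poorly compressible payload (an illustrative choice).
PAYLOAD = random.Random(0).randbytes(CHUNK_SIZE * NUM_CHUNKS)


def decompress_shared(payload=PAYLOAD):
    """Drain one shared LZMADecompressor from several threads."""
    lzd = lzma.LZMADecompressor()
    # Feed all compressed input once; this call returns the first chunk.
    chunks = [lzd.decompress(lzma.compress(payload), CHUNK_SIZE)]

    def worker():
        # No new input: pull the next chunk from the data already buffered
        # inside the decompressor.
        chunks.append(lzd.decompress(b"", CHUNK_SIZE))

    threads = [threading.Thread(target=worker) for _ in range(NUM_CHUNKS - 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return chunks


def expected_chunks(payload=PAYLOAD):
    """Slices a single-threaded reader would get, in stream order."""
    return [payload[i:i + CHUNK_SIZE]
            for i in range(0, len(payload), CHUNK_SIZE)]


chunks = decompress_shared()
```

If the decompressor's internal state is properly serialized, every call hands out a distinct, consecutive slice of the stream, so `sorted(chunks)` should equal `sorted(expected_chunks())`.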

cc: @mpage @colesbury @emmatyping

Issue: Audit all built-in modules for thread safety #116738

ashm-dev · 2025-10-28T17:04:16Z

Lib/test/test_free_threading/test_lzma.py

+        def worker():
+            # it should return empty bytes as it buffers data internally
+            data = lzc.compress(INPUT)
+            self.assertEqual(data, b"")


The assertion self.assertEqual(data, b"") is flaky. In free-threaded mode, compress() may return data chunks non-deterministically due to race conditions in internal buffering.

@ashm-dev Thanks for your comment. This test is meant to verify that the mutex protects the internal state and buffering, so there shouldn’t be a race condition. Could you please explain which race condition you mean? That would help me understand your point better.

@ashm-dev Are you using ChatGPT or another LLM to review for you? If so, please don't -- it's not helpful. If not, please try to be clearer in your responses.

ashm-dev · 2025-10-28T17:05:00Z

Lib/test/test_free_threading/test_lzma.py

+        def worker():
+            data = lzd.decompress(compressed, chunk_size)
+            self.assertEqual(len(data), chunk_size)
+            output.append(data)


output.append(data) without synchronization causes race conditions in free-threaded mode, potentially losing data or corrupting the list.

@ashm-dev list is thread safe in free-threaded build.

In the free-threaded build, list operations use internal locks to avoid crashes, but thread safety isn’t guaranteed for concurrent mutations; see the Python free-threading HOWTO.
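The narrow question in this thread, whether concurrent `list.append` calls can lose items, can be checked empirically. The sketch below is an assumption-laden illustration (thread count, item count, and the name `append_concurrently` are made up here). It only exercises `append`, so it says nothing about other concurrent list mutations.

```python
import threading

NUM_THREADS = 8
APPENDS_PER_THREAD = 10_000


def append_concurrently():
    """Have several threads append to one shared list without extra locking."""
    output = []

    def worker(thread_id):
        for i in range(APPENDS_PER_THREAD):
            # Each item is tagged so lost or duplicated appends are detectable.
            output.append((thread_id, i))

    threads = [threading.Thread(target=worker, args=(n,))
               for n in range(NUM_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output


result = append_concurrently()
```

The interleaving of items from different threads is unspecified, so any check should compare `sorted(result)` against the expected set of items rather than rely on order.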

ashm-dev · 2025-10-28T17:06:31Z

Lib/test/test_free_threading/test_lzma.py

+
+        def worker():
+            data = lzd.decompress(compressed, chunk_size)
+            self.assertEqual(len(data), chunk_size)


self.assertEqual(len(data), chunk_size) is wrong. decompress() may return fewer than max_length bytes.

@ashm-dev I agree that decompress() can return less than max_length if there isn’t enough input. In this test, I’m providing input that should produce at least max_length bytes. Is there anything else I might be missing? If I give enough valid input, is there any reason why lzma wouldn’t return max_length?

There are other tests making similar assumptions.

cpython/Lib/test/test_lzma.py

Lines 164 to 169 in ce4b0ed

    # Feed first half the input
    len_ = len(COMPRESSED_XZ) // 2
    out.append(lzd.decompress(COMPRESSED_XZ[:len_],
                              max_length=max_length))
    self.assertFalse(lzd.needs_input)
    self.assertEqual(len(out[-1]), max_length)
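The assumption being discussed here (enough valid input yields exactly `max_length` bytes) can be tried out in a single thread. This is a minimal sketch with made-up sizes and a hypothetical helper name, `decompress_in_pieces`. It assumes a complete, valid stream and is not code from the PR or from `test_lzma.py`.

```python
import lzma
import random

MAX_LENGTH = 1000
# Deterministic payload, ten times larger than one output piece.
ORIGINAL = random.Random(1).randbytes(10 * MAX_LENGTH)


def decompress_in_pieces(compressed, max_length=MAX_LENGTH):
    """Decompress a complete stream in pieces of at most max_length bytes."""
    lzd = lzma.LZMADecompressor()
    # The first call receives all of the input but may return only
    # max_length bytes; the rest stays buffered in the decompressor.
    pieces = [lzd.decompress(compressed, max_length)]
    # Keep draining while buffered data remains; stop if the decompressor
    # asks for more input, which would indicate a truncated stream.
    while not lzd.eof and not lzd.needs_input:
        pieces.append(lzd.decompress(b"", max_length))
    return pieces


pieces = decompress_in_pieces(lzma.compress(ORIGINAL))
```

The final piece may be shorter than `max_length` or even empty, since the end-of-stream marker can be consumed in a call that produces no further output. For that reason a length assertion is safest on pieces that are known to precede the end of the data.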

pythongh-116738: Use PyMutex for lzma module

ef23321

bedevere-app bot mentioned this pull request Oct 28, 2025

Audit all built-in modules for thread safety #116738

Open

yoney marked this pull request as ready for review October 28, 2025 16:39

bedevere-app bot added the awaiting review label Oct 28, 2025

ashm-dev reviewed Oct 28, 2025


mpage requested review from colesbury, emmatyping and mpage October 28, 2025 20:21

mpage added the skip news label Oct 28, 2025

kumaraditya303 added the topic-free-threading label Oct 29, 2025
