Fix racing in shared memory creation #966

Merged
XiaohanZhangCMU merged 8 commits into main from xiaohan/fix-racing-in-shared-memory
Jan 23, 2026

XiaohanZhangCMU (Collaborator) commented Jan 22, 2026

Description of changes:

This PR aims to address several issues reported during dataset creation on certain operating systems. The working theory is that some operating systems are slow to propagate shared memory across processes, so the non-local leaders need to wait longer for the local leader.
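
To illustrate the idea of giving non-leader processes more time, here is a minimal sketch of a polling attach with a deadline. It is not the code in this PR: `attach_with_retry`, its `timeout` and `interval` parameters, and the default values are hypothetical names chosen for this example. It only assumes Python's standard `multiprocessing.shared_memory` module, where attaching to a block that does not exist yet raises `FileNotFoundError`.

```python
import time
from multiprocessing import shared_memory


def attach_with_retry(name, timeout=60.0, interval=0.1):
    """Attach to an existing shared memory block, polling until it appears.

    Hypothetical helper for illustration only; not part of this repository's API.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create=False attaches to a block that another process created.
            return shared_memory.SharedMemory(name=name, create=False)
        except FileNotFoundError:
            # The creating process has not made the block yet, or it is not
            # yet visible to this process. Give up once the deadline passes.
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"shared memory {name!r} did not appear within {timeout}s"
                )
            time.sleep(interval)
```

A non-leader process would call `attach_with_retry(name)` with the name the leader used when creating the block. A longer `timeout` trades slower failure detection for more tolerance of slow propagation.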

Issue #, if available:

Merge Checklist:

Put an x without space in the boxes that apply. If you are unsure about any checklist, please don't hesitate to ask. We are here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the contributor guidelines
  • This is a documentation change or typo fix. If so, skip the rest of this checklist.
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the MosaicML team.
  • I have updated any necessary documentation, including README and API docs (if appropriate).

Tests

  • I ran pre-commit on my change. (check out the pre-commit section of prerequisites)
  • I have added tests that prove my fix is effective or that my feature works (if appropriate).
  • I ran the tests locally to make sure they pass. (check out testing)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes.

XiaohanZhangCMU force-pushed the xiaohan/fix-racing-in-shared-memory branch from f34ba66 to dadd9c2 on January 22, 2026 22:48
XiaohanZhangCMU enabled auto-merge (squash) on January 23, 2026 23:47
XiaohanZhangCMU merged commit 6a4e12f into main on Jan 23, 2026
7 checks passed
XiaohanZhangCMU deleted the xiaohan/fix-racing-in-shared-memory branch on January 23, 2026 23:47

2 participants