@corpoverlords commented Oct 21, 2025

Rationale for this change

This would allow Arrow to coexist with other libraries that also use the AWS SDK. For roughly two years, the AWS SDK has internally reference-counted InitAPI calls and has supported re-initialization after deinitialization.
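
To illustrate why reference counting lets several users of one SDK coexist, here is a minimal sketch. `RefCountedSdkSketch`, `InitApi`, and `ShutdownApi` are hypothetical names invented for this example; this is a model of the idea, not the AWS SDK's actual implementation.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical model of a reference-counted global init/shutdown pair.
// Assumption: this only mirrors the idea described above, not the SDK's code.
class RefCountedSdkSketch {
 public:
  // Only the first outstanding caller performs the real initialization.
  void InitApi() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (ref_count_++ == 0) {
      ++real_inits_;
      live_ = true;
    }
  }

  // Only the last outstanding caller performs the real shutdown.
  void ShutdownApi() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (ref_count_ == 0) return;  // unbalanced call: ignored in this sketch
    if (--ref_count_ == 0) {
      live_ = false;
    }
  }

  bool live() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return live_;
  }

  int real_inits() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return real_inits_;
  }

 private:
  mutable std::mutex mutex_;
  int ref_count_ = 0;
  int real_inits_ = 0;
  bool live_ = false;
};

// Two independent users (say, Arrow and another library) share one SDK.
inline void CoexistenceDemo() {
  RefCountedSdkSketch sdk;
  sdk.InitApi();      // first user
  sdk.InitApi();      // second user: no second real init
  sdk.ShutdownApi();  // first user is done
  assert(sdk.live()); // still usable by the second user
  sdk.ShutdownApi();  // last user is done: real shutdown
  sdk.InitApi();      // re-initialization after full shutdown
  assert(sdk.live() && sdk.real_inits() == 2);
}
```

In this model, one user shutting down does not pull the SDK out from under the other, and a later init after a full shutdown simply starts a fresh cycle.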

  • Introduced a mutex for thread-safe re-initialization of the S3 client after finalization, replacing the previous std::call_once mechanism.
  • Added an Initialize method to reset the finalized state of the S3 client.
  • Updated EnsureInitialized to allow re-initialization while ensuring thread safety.
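
The three points above can be sketched roughly as follows. The class `S3LifecycleSketch` and its counters are hypothetical stand-ins, and the choice to make `EnsureInitialized` return `false` while finalized is an assumption of this sketch; the code in the PR itself may differ.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the S3 client lifecycle; not Arrow's actual code.
class S3LifecycleSketch {
 public:
  // Initializes the backend if needed. A std::call_once guard would run the
  // init step at most once per process; a mutex plus flags lets it run again
  // after Finalize() followed by Initialize().
  // Assumption: returns false while the client is finalized.
  bool EnsureInitialized() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (finalized_) return false;
    if (!initialized_) {
      ++init_count_;  // stands in for the real SDK initialization call
      initialized_ = true;
    }
    return true;
  }

  // Shuts the backend down and marks the client as finalized.
  void Finalize() {
    std::lock_guard<std::mutex> lock(mutex_);
    initialized_ = false;
    finalized_ = true;
  }

  // Resets the finalized state so a later EnsureInitialized() may re-init.
  void Initialize() {
    std::lock_guard<std::mutex> lock(mutex_);
    finalized_ = false;
  }

  int init_count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return init_count_;
  }

 private:
  mutable std::mutex mutex_;
  bool initialized_ = false;
  bool finalized_ = false;
  int init_count_ = 0;
};

// Walks through one init / finalize / re-init cycle.
inline void LifecycleDemo() {
  S3LifecycleSketch s3;
  assert(s3.EnsureInitialized());
  s3.Finalize();
  assert(!s3.EnsureInitialized());  // finalized: refused in this sketch
  s3.Initialize();
  assert(s3.EnsureInitialized());   // backend initialized a second time
  assert(s3.init_count() == 2);
}
```

Every state change happens under the same mutex, so concurrent callers of `EnsureInitialized` see a consistent view and the init step runs once per cycle rather than once per process.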

This change improves the flexibility and safety of the S3 client lifecycle management.

What changes are included in this PR?

Changes to S3 filesystem (S3FS) initialization.

Are these changes tested?

Tested with our local infrastructure. I can add a unit test in a dedicated .cpp file (because of the init/deinit usage) if wanted.

Are there any user-facing changes?

No

@github-actions

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

@corpoverlords changed the title from "Allow S3 Filesystem Re-initialization" to "GH-47904: [C++] [Python] Allow S3 Filesystem Re-initialization" on Oct 21, 2025
@github-actions

⚠️ GitHub issue #47904 has been automatically assigned in GitHub to PR creator.
