Skip to content

Conversation

@kbenoit
Copy link

@kbenoit kbenoit commented May 20, 2025

Add TBB (onetbb), needed for the quanteda package that uses these to build parallelism into the binary. Without it, the Mac build is single-threaded only.

@s-u
Copy link
Member

s-u commented Oct 24, 2025

@kbenoit unfortunately, oneTBB is not safe to use as static library, because they do not manage global state correctly. Could you use header-only concurrency implementation instead?

For posterity - here is the updated recipes, but for reasons above we don't want to include it.

Package: onetbb
Version: 2022.2.0
Source-URL: https://github.com/uxlfoundation/oneTBB/archive/refs/tags/v${ver}.tar.gz
Source-SHA256: f0f78001c8c8edb4bddc3d4c5ee7428d56ae313254158ad1eec49eced57f6a5b
Configure: -DTBB_TEST=OFF
Build-system: cmake

@s-u
Copy link
Member

s-u commented Oct 25, 2025

Also any chance you could use RcppParallel instead? It already provides TBB so that would be one way the ensure a shared instance.

@kbenoit
Copy link
Author

kbenoit commented Oct 26, 2025

Hi @s-u, that's what we did in earlier versions, but then moved to the recipe because it was too frustrating to deal with the disconnect between RcppParallel's bundled version and the test platform requirements used by CRAN. Maybe this has changed now though. @koheiw, any suggestions?

@koheiw
Copy link

koheiw commented Oct 27, 2025

The development version of RcppParallel has TBB 2022 according NEWS. On Windows, however, it uses TBB in Rtools just like we are doing. I (and probably RcppParallel developers) wish CRAN to have TBB header for Mac in a similar manner.

https://github.com/RcppCore/RcppParallel/blob/master/NEWS.md

@kevinushey
Copy link

Relatedly -- I'd like to get a new version of RcppParallel to CRAN, but doing so while also avoiding ABI breakage in existing R packages has proven challenging, as older versions of RcppParallel linked to some components of TBB directly.

This issue has been rectified in the most recent RcppParallel release, so I'm planning to submit a new version of RcppParallel (with a new version of TBB) sometime after packages have been rebuilt against the new RcppParallel.

@koheiw
Copy link

koheiw commented Oct 27, 2025

quanteda was one of the packages that directly call TBB components in RcppParallel such as tbb::parallel_for and tbb::concurrent_unordered_map etc. If it is discourage in the new version of RcppParallel, we need the header files in recipe (I tried to include the TBB header in quanteda itself, but I could not make it work).

@s-u
Copy link
Member

s-u commented Oct 28, 2025

Just to be clear:

  • we cannot add oneTBB to recipes, because it doesn't work as a static library due to its handling of globals (it even warns if you try to build it statically) - it will crash if it is linked more than once so that's a no-go.
  • the only way to use oneTBB is as a common shared library such that all globals are loaded only once.

Therefore there are only two options:

  • have one dedicated package to provide oneTBB - currently that is RcppParallel as it uses a shared build of oneTBB so it seems natural to have that one provide it to anyone else who needs access to it.
  • if possible (I don't know you'd have to find out) use header-only implementation of the features from oneTBB that you need. In that case you can simply include those headers in your package, there is no binary needed. However, from what I saw the commonly used things like concurrent containers may need the global scheduler in which case they cannot be used that way.

@koheiw
Copy link

koheiw commented Oct 29, 2025

Thank you for explaining. Now I understand why I could not compile our package with oneTBB headers. I agree that RcppParallel is the best package to have oneTBB. I hope @kevinushey and other to allow us to use its components directly.

@s-u
Copy link
Member

s-u commented Oct 29, 2025

@kevinushey Do you think you can get RcppParallel to be the canonical package to provide TBB for all others? Please keep me and Tomáš in the loop (by email) so we can synchronize accordingly.

@kevinushey
Copy link

I'm happy to do that, but the main challenge is dealing with ABI breakages when new versions of RcppParallel are released with new versions of TBB (for packages which depend on TBB directly, rather than via RcppParallel APIs). As an example, stringfish links to RcppParallel, but uses TBB APIs directly and so has

kevin@kbox:~/R/aarch64-unknown-linux-gnu-library/4.3/stringfish/libs
$ nm stringfish.so | c++filt | grep tbb::
00000000000606b0 b guard variable for _stringfish_is_tbb::stop_sym
0000000000035660 W tbb::interface5::internal::split_ordered_list<std::pair<RStringIndexer::rstring_info const, std::atomic<int> >, tbb::tbb_allocator<std::pair<RStringIndexer::rstring_info const, std::atomic<int> > > >::~split_ordered_list()
0000000000035660 W tbb::interface5::internal::split_ordered_list<std::pair<RStringIndexer::rstring_info const, std::atomic<int> >, tbb::tbb_allocator<std::pair<RStringIndexer::rstring_info const, std::atomic<int> > > >::~split_ordered_list()
0000000000036070 W tbb::interface5::internal::concurrent_unordered_base<tbb::interface5::concurrent_unordered_map_traits<RStringIndexer::rstring_info, std::atomic<int>, tbb::interface5::internal::hash_compare<RStringIndexer::rstring_info, rstring_info_hash, std::equal_to<RStringIndexer::rstring_info> >, tbb::tbb_allocator<std::pair<RStringIndexer::rstring_info const, std::atomic<int> > >, false> >::init_bucket(unsigned long)
< ... lots more stuff ... >

and so existing installations of stringfish will almost surely break when new versions of RcppParallel include newer releases of TBB. This also implies that reverse dependency checks for RcppParallel will likely fail on submission, since the packages being tested might not be rebuilt when tests are run.

Just to confirm this is true, here's what I see if I try to use a version of stringfish that was built against RcppParallel 5.1.11-1 (current CRAN release) but with the development version of RcppParallel installed:

> library(stringfish)
Error: package or namespace load failed for 'stringfish' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/home/kevin/R/aarch64-unknown-linux-gnu-library/4.3/stringfish/libs/stringfish.so':
  /home/kevin/R/aarch64-unknown-linux-gnu-library/4.3/stringfish/libs/stringfish.so: undefined symbol: _ZN3tbb8internal25concurrent_vector_base_v316internal_grow_byEmmPFvPvPKvmES4_

Some recent changes in the development version of RcppParallel have it statically linking to TBB; we'll need to consider if that's really safe to do. IIRC this was requested by Tomas as well, so that RcppParallel could use the version of TBB provided by Rtools.

https://github.com/RcppCore/RcppParallel/pull/241/files

@s-u
Copy link
Member

s-u commented Oct 30, 2025

@kevinushey This happens occasionally (e.g. Matrix was changing its ABI) - you just have to synchronize with CRAN to make sure we re-build the binaries in that case (Uwe+I). (When you write to CRAN, please always CC me).

@s-u
Copy link
Member

s-u commented Oct 30, 2025

@kevinushey Re static linking - it is not safe, I'd much rather that RcppParallel was the official distribution of shared TBB, but we should discuss all this via the official channels so it doesn't get lost - this issue is not the right place for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants