The Python and C++ integration tests both cover PREMUL_SUM ops for AllReduce, but there are no equivalent tests for Reduce or ReduceScatter.
Is there a reason for this, or is it just an oversight? If there's no reason to avoid them, I'll submit a PR adding the tests.