diff --git a/content/backmatter.tex b/content/backmatter.tex
index dbd707baa..988a10308 100644
--- a/content/backmatter.tex
+++ b/content/backmatter.tex
@@ -184,7 +184,195 @@ \chapter{Undefined Behavior in OpenSHMEM}\label{sec:undefined}
 \end{longtable}
+\color{ForestGreen}
+\chapter{Interoperability with Other Programming Models}\label{sec:interoperability}
+
+\openshmem routines may be used in conjunction with the routines of other
+communication libraries or parallel languages in the same program. This chapter
+describes the interoperability with other programming models, including
+clarification of undefined behaviors caused by mixed use of different models,
+advice to \openshmem library users and developers that may improve the portability
+and performance of hybrid programs, and the definition of an \openshmem
+API that queries the interoperability features provided by an \openshmem library.
+
+
+\section{MPI Interoperability}
+
+\openshmem and MPI are two commonly used parallel programming models for
+distributed-memory systems. The user can choose to utilize both models in the same program
+to efficiently and easily support various communication patterns.
+
+A vendor may implement the \openshmem and MPI libraries in different ways. For
+instance, one may implement both \openshmem and MPI as standalone libraries,
+each of which allocates and initializes fully isolated communication
+resources.
+Alternatively,
+a vendor may implement both the \openshmem and MPI interfaces within the
+same software system in order to share communication resources when possible.
+
+To improve interoperability and portability in \openshmem + MPI hybrid
+programming, we clarify the relevant semantics in the following subsections.
+
+
+\subsection{Initialization}
+To ensure that a hybrid program runs portably across different vendor
+implementations, the \openshmem environment of the program must be initialized by
+a call to \FUNC{shmem\_init} or \FUNC{shmem\_init\_thread} and be finalized by
+a call to \FUNC{shmem\_finalize}; the MPI environment of the program must be initialized
+by a call to \FUNC{MPI\_Init} or \FUNC{MPI\_Init\_thread} and be finalized by a
+call to \FUNC{MPI\_Finalize}.
+
+\apiimpnotes{
+Portable implementations of \openshmem and MPI must ensure that the initialization
+calls can be made in an arbitrary order within a program; the same rule also
+applies to the finalization calls. A software runtime that utilizes a shared
+communication resource for \openshmem and MPI communication may maintain an
+internal reference counter in order to ensure that the shared resource is
+initialized only once and that no shared resource is released until the last
+finalization call is made.
+}
+
+
+\subsection{Dynamic Process Creation and MPMD Programming}
+\label{subsec:interoperability:mpmd}
+
+MPI defines a dynamic process model that allows creation of processes after
+an MPI application has started (e.g., by calling \FUNC{MPI\_Comm\_spawn}) and
+connection to independent processes (e.g., through \FUNC{MPI\_Comm\_accept}
+and \FUNC{MPI\_Comm\_connect})
+and provides a mechanism to establish communication
+between the newly created processes and the existing MPI application (see
+MPI standard version 3.1, Chapter 10).
+Unlike MPI, \openshmem starts all processes at once and requires all PEs to
+collectively allocate and initialize resources (e.g., symmetric heap) used by
+the \openshmem library before any other \openshmem routine may
+be called. Communicating with a dynamically created process in the \openshmem
+environment may result in undefined behavior.
+Hence, users should not use \openshmem and the MPI dynamic process model
+in the same program.
+
+
+\subsection{Thread Safety}
+\label{subsec:interoperability:thread}
+Both \openshmem and MPI define the interaction with user threads in a program
+with routines that can be used for initializing and querying the thread
+environment. In a hybrid program, the user may request different thread levels
+at the initialization calls of \openshmem and MPI environments; however, the
+returned support level provided by the \openshmem library might be different
+from that returned in an \openshmem-only program. For instance, the earlier
+initialization call in a hybrid program may initialize a resource with the
+user-requested thread level, but the supported level cannot be updated by the later
+initialization call if the underlying software runtime of \openshmem and MPI
+shares the same internal communication resource.
+The program should always check the \VAR{provided} thread level returned
+at the corresponding initialization call or query the level of thread support
+after initialization to portably ensure thread support in each communication
+environment.
+
+Both \openshmem and MPI define similar thread levels, namely, \VAR{THREAD\_SINGLE},
+\VAR{THREAD\_FUNNELED}, \VAR{THREAD\_SERIALIZED}, and \VAR{THREAD\_MULTIPLE}.
+When requesting threading support in a hybrid program, however,
+users should follow additional rules as described below.
+\begin{itemize}
+  \item The \VAR{THREAD\_SINGLE} thread level requires a single-threaded program.
+  Hence, users should not request \VAR{THREAD\_SINGLE} at the initialization
+  call of either \openshmem or MPI while requesting a different thread level at the
+  initialization call of the other model in the same program.
+
+  \item The \VAR{THREAD\_FUNNELED} thread level allows only the main thread to
+  make communication calls. A hybrid program using the \VAR{THREAD\_FUNNELED}
+  thread level in both \openshmem and MPI should ensure the same main thread
+  is used in both communication environments.
+
+  \item The \VAR{THREAD\_SERIALIZED} thread level requires the program to ensure
+  communication calls are not made concurrently by multiple threads. A hybrid
+  program should ensure serialized calls to both \openshmem and MPI libraries,
+  if the program uses \VAR{THREAD\_SERIALIZED} in one communication environment
+  and \VAR{THREAD\_SERIALIZED} or \VAR{THREAD\_FUNNELED} in the other one.
+\end{itemize}
+
+\subsection{Mapping Process Identification Numbers}
+\label{subsec:interoperability:id}
+
+Similar to the PE identifier in \openshmem, MPI defines rank as the
+identification number of a process in a communicator. Both the \openshmem PE
+number and the MPI rank are unique integers ranging from zero to one less than the total
+number of processes. In a hybrid program, the \openshmem
+PE number and the MPI rank in \VAR{MPI\_COMM\_WORLD} of a process can be equal.
+This feature, however, may be provided by only some of the \openshmem and MPI
+implementations (e.g., if both environments share the same underlying process
+manager) and is not portably guaranteed. A portable program should always
+use the standard functions in each model, namely, \FUNC{shmem\_my\_pe} in \openshmem
+and \FUNC{MPI\_Comm\_rank} in MPI, to query the process identification numbers
+in each communication environment and manage the mapping of identifiers in the
+program when necessary.
+
+\subsubsection{Example}
+\label{subsubsec:interoperability:id:example}
+The following example demonstrates how to manage the mapping between \openshmem
+PE identifiers and MPI ranks in \VAR{MPI\_COMM\_WORLD} in a hybrid \openshmem
+and MPI program.
+
+\lstinputlisting[language={C}, tabsize=2,
+  basicstyle=\ttfamily\footnotesize]
+  {example_code/hybrid_mpi_mapping_id.c}
+
+\subsection{RMA Programming Models}
+\label{subsec:interoperability:rma}
+
+Both \openshmem and MPI define similar RMA and atomic operations for remote memory
+access; however, a portable program should not assume interoperability between these
+two RMA models.
+For instance, \openshmem guarantees the atomicity only of concurrent \openshmem AMOs
+that operate on symmetric data with the same datatype. Access to the same symmetric
+object with MPI atomic operations, such as \FUNC{MPI\_Fetch\_and\_op}, may
+produce an undefined result. Furthermore,
+because most RMA programs can be written using either \openshmem or MPI RMA,
+users should choose only one of the RMA models in the same program, whenever
+possible, for performance and code simplicity.
+
+\subsection{Communication Progress}
+\label{subsec:interoperability:progress}
+
+\openshmem promises the progression of communication both with and without
+\openshmem calls and requires a software progress mechanism in the implementation
+(e.g., a progress thread) when the hardware does not provide asynchronous communication
+capabilities. In MPI, however, weak progress semantics apply. That is,
+an MPI communication call is guaranteed only to complete in finite time. For
+instance, an \FUNC{MPI\_Put} may be completed only when the remote process makes an MPI
+call that internally triggers the progress of MPI, if the underlying hardware
+does not support asynchronous communication. A hybrid program
+should not assume that the \openshmem library also makes progress for MPI.
+A call to \FUNC{shmem\_query\_interoperability} with the \VAR{SHMEM\_PROGRESS\_MPI}
+property (see the definition in \ref{subsec:interoperability:query})
+can be used to portably check whether the implementation provides asynchronous
+progression also for MPI.
If it is not provided, the user program may have to
+explicitly manage the asynchronous communication in MPI in
+order to prevent any deadlock or performance degradation.
+
+\apiimpnotes{
+Implementations that provide both \openshmem and MPI interfaces should try
+to ensure progress for both models when necessary and possible, for performance
+reasons. For instance, an implementation may start making progress for
+both \openshmem and MPI whenever possible, after the user program has called
+\FUNC{shmem\_init} and \FUNC{MPI\_Init} provided by the same system.
+}
+
+
+\section{Query Interoperability}
+
+A hybrid user program can query the interoperability features of an \openshmem
+implementation in order to avoid unnecessary overhead and programming complexity.
+For instance, the user program can eliminate manual progress polling for MPI
+communication if the underlying software runtime guarantees the progression of
+communication also for MPI even without explicit function calls.
+
+\subsection{\textbf{SHMEM\_QUERY\_INTEROPERABILITY}}
+\label{subsec:interoperability:query}
+\input{content/shmem_query_interoperability}
+
+\color{black}
 \chapter{History of OpenSHMEM}\label{sec:openshmem_history}
diff --git a/content/shmem_query_interoperability.tex b/content/shmem_query_interoperability.tex
new file mode 100644
index 000000000..8af1e26ca
--- /dev/null
+++ b/content/shmem_query_interoperability.tex
@@ -0,0 +1,39 @@
+\apisummary{
+  Determines whether an interoperability feature is supported by the \openshmem
+  library implementation.
+}
+\begin{apidefinition}
+
+\begin{Csynopsis}
+int @\FuncDecl{shmem\_query\_interoperability}@(int property);
+\end{Csynopsis}
+
+\begin{apiarguments}
+  \apiargument{IN}{property}{The interoperability property queried by the user.}
+\end{apiarguments}
+
+% compiling error ?
+% \apidescription{
+\FUNC{shmem\_query\_interoperability} queries whether an interoperability property
+is supported by the \openshmem library.
One of the following properties can be
+queried in an \openshmem program after completing the
+initialization calls to \openshmem and to the relevant programming models
+being used in the program. An \openshmem library implementation may extend the
+available properties.
+
+\begin{itemize}
+\item \VAR{SHMEM\_PROGRESS\_MPI}: Queries whether the \openshmem
+implementation makes progress for the MPI communication used in the user program.
+\end{itemize}
+% }
+
+\apireturnvalues{
+  The return value is \CONST{1} if \VAR{property} is supported by the \openshmem library;
+  otherwise, it is \CONST{0}.
+}
+\end{apidefinition}
+
+\apiimpnotes{
+Implementations that do not support interoperability with other programming models
+may simply return \CONST{0} for the relevant interoperability query.
+}
diff --git a/example_code/hybrid_mpi_mapping_id.c b/example_code/hybrid_mpi_mapping_id.c
new file mode 100644
index 000000000..c72168d6e
--- /dev/null
+++ b/example_code/hybrid_mpi_mapping_id.c
@@ -0,0 +1,37 @@
+#include <stdio.h>
+#include <mpi.h>
+#include <shmem.h>
+
+int main(int argc, char *argv[])
+{
+    static long pSync[SHMEM_COLLECT_SYNC_SIZE];
+    for (int i = 0; i < SHMEM_COLLECT_SYNC_SIZE; i++)
+        pSync[i] = SHMEM_SYNC_VALUE;
+
+    MPI_Init(&argc, &argv);
+    shmem_init();
+
+    int mype = shmem_my_pe();
+    int npes = shmem_n_pes();
+
+    /* Static, hence symmetric, so it can be the source of the collect. */
+    static int myrank;
+    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
+
+    int *mpi_ranks = shmem_calloc(npes, sizeof(int));
+
+    /* Ensure pSync is initialized on all PEs before the collective. */
+    shmem_sync_all();
+    shmem_collect32(mpi_ranks, &myrank, 1, 0, 0, npes, pSync);
+
+    if (mype == 0)
+        for (int i = 0; i < npes; i++)
+            printf("PE %d's MPI rank is %d\n", i, mpi_ranks[i]);
+
+    shmem_free(mpi_ranks);
+
+    shmem_finalize();
+    MPI_Finalize();
+
+    return 0;
+}