OSU Micro Benchmarks v5.1 (11/10/15)

* New Features & Enhancements
    - Introduce non-blocking collective v-variants as well as ialltoallw
        * osu_iallgatherv
        * osu_ialltoallv
        * osu_igatherv
        * osu_iscatterv
        * osu_ialltoallw
    - Add support for benchmarking GPU-Aware non-blocking collectives. Overlap
      can be computed using either CPU or GPU kernels
        * osu_iallgather
        * osu_iallgatherv
        * osu_ialltoall
        * osu_ialltoallv
        * osu_ialltoallw
        * osu_ibcast
        * osu_igather
        * osu_igatherv
        * osu_iscatter
        * osu_iscatterv
    - Allow users the ability to specify zero warmup iterations

* Bug Fixes
    - fix openacc pragma

OSU Micro Benchmarks v5.0 (08/17/15)

* New Features & Enhancements
    - Support for a set of non-blocking collectives. The benchmarks can display
      both the amount of time spent in the collectives and the amount of
      overlap achievable
        * osu_iallgather
        * osu_ialltoall
        * osu_ibarrier
        * osu_ibcast
        * osu_igather
        * osu_iscatter
    - Add startup benchmarks to facilitate the ability to measure the amount of
      time it takes for an MPI library to complete MPI_Init
        * osu_init
        * osu_hello
    - Allocate and align data dynamically
        - Thanks to Devendar Bureddy from Mellanox for the suggestion
    - Add options for number of warmup iterations [-x] and number of iterations
      used per message size [-i] to MPI benchmarks
        - Thanks to Devendar Bureddy from Mellanox for the suggestion

* Bug Fixes
    - Do not truncate user specified max memory limits
        - Thanks to Devendar Bureddy from Mellanox for the report and patch

OSU Micro Benchmarks v4.4.1 (10/30/14)

* Bug Fixes
    - adding missing MPI3 guard for WIN_ALLOCATE
    - capture getopt return value in an int instead of char

OSU Micro Benchmarks v4.4 (8/23/14)

* New Features & Enhancements
    - Support for MPI-3 RMA (one-sided) and atomic operations using GPU buffers
        * osu_acc_latency
        * osu_cas_latency
        * osu_fop_latency
        * osu_get_bw
        * osu_get_latency
        * osu_put_bibw
        * osu_put_bw
        * osu_put_latency

* Bug Fixes
    - remove use of AC_FUNC_MALLOC to avoid undefined rpl_malloc reference
    - add missing upc benchmarks for make dist rule

OSU Micro Benchmarks v4.3.1 (6/20/14)

* Bug Fixes
    - Fix typo in MPI collective benchmark help message
    - Explicitly mention that -m and -M parameters are specified in bytes

OSU Micro Benchmarks v4.3 (3/24/14)

* New Features & Enhancements
    - This new suite includes several new (or updated) benchmarks to measure
      performance of MPI-3 RMA communication operations with options to select
      different window creation (WIN_CREATE, WIN_DYNAMIC, and WIN_ALLOCATE) and
      synchronization functions (LOCK, PSCW, FENCE, FLUSH, FLUSH_LOCAL, and
      LOCK_ALL) in each benchmark
        * osu_acc_latency
        * osu_cas_latency
        * osu_fop_latency
        * osu_get_acc_latency
        * osu_get_bw
        * osu_get_latency
        * osu_put_bibw
        * osu_put_bw
        * osu_put_latency
    - New UPC Collective Benchmarks
        * osu_upc_all_barrier
        * osu_upc_all_broadcast
        * osu_upc_all_exchange
        * osu_upc_all_gather
        * osu_upc_all_gather_all
        * osu_upc_all_reduce
        * osu_upc_all_scatter
    - Build MPI3 benchmarks when MPI library support is detected

* Bug Fixes
    - Add shmem_quiet() in OpenSHMEM Message Rate benchmark to ensure all
      previously issued operations are completed
    - Allocate pWrk from symmetric heap in OpenSHMEM Reduce benchmark

OSU Micro Benchmarks v4.2 (11/08/13)

* New Features & Enhancements
    - New OpenSHMEM benchmarks
        * osu_oshm_fcollect
    - Enable handling of GPU device buffers in all MPI collective benchmarks
    - Add device binding for OpenACC benchmarks

* Bug Fixes
    - Add upc_fence after memput in osu_upc_memput benchmark
    - Correct CUDA configuration example in README
    - Fix several warnings

OSU Micro Benchmarks v4.1 (8/24/13)

* New Features & Enhancements
    - New OpenSHMEM benchmarks
        * osu_oshm_barrier
        * osu_oshm_broadcast
        * osu_oshm_collect
        * osu_oshm_reduce
    - New MPI-3 RMA Atomics benchmarks
        * osu_cas_flush
        * osu_fop_flush

OSU Micro Benchmarks v4.0.1 (5/06/13)

* Bug Fixes
    - Fix several warnings

OSU Micro Benchmarks v4.0 (4/16/13)

* New Features & Enhancements
    - Support buffer allocation using OpenACC and CUDA in osu_alltoall,
      osu_gather, and osu_scatter benchmarks
    - Limit amount of memory allocated by collective benchmarks dynamically
      based on number of processes
        - Memory limit can also be explicitly set by the user through the -m
          option
    - Support for 64-bit atomic operations in osu_oshm_atomics

* Bug Fixes
    - Fix numerical overflow error with reporting bandwidth in osu_mbw_mr

OSU Micro Benchmarks v3.9 (2/28/13)

* New Features & Enhancements
    - Support buffer allocation using OpenACC in GPU benchmarks
    - Use average time instead of max time for calculating the bandwidth and
      message rate in osu_mbw_mr
        - Thanks to Alex Mikheev from Mellanox for the patch

* Bug Fixes
    - Properly initialize host buffers for DH and HD transfers in GPU
      benchmarks

OSU Micro Benchmarks v3.8 (11/07/12)

* New Features & Enhancements
    - New UPC benchmarks
        * osu_upc_memput
        * osu_upc_memget

OSU Micro Benchmarks v3.7 (9/07/12)

* New Features & Enhancements
    - New OpenSHMEM benchmarks
        * osu_oshm_get
        * osu_oshm_put_mr
        * osu_oshm_atomics
        * osu_oshm_put
    - Organize installation directory according to benchmark type

* Bug Fixes
    - Destroy cuda context before exiting

OSU Micro Benchmarks v3.6 (4/30/12)

* New Features & Enhancements
    - New collective benchmarks
        * osu_allgather
        * osu_allgatherv
        * osu_allreduce
        * osu_alltoall
        * osu_alltoallv
        * osu_barrier
        * osu_bcast
        * osu_gather
        * osu_gatherv
        * osu_reduce
        * osu_reduce_scatter
        * osu_scatter
        * osu_scatterv

* Bug Fixes
    - Fix GPU binding issue when running with HH mode

OSU Micro Benchmarks v3.5.2 (3/22/12)

* Bug Fixes
    - Fix typo which led to use of incorrect buffers

OSU Micro Benchmarks v3.5.1 (2/02/12)

* New Features & Enhancements
    - Provide script to set GPU affinity for MPI processes

* Bug Fixes
    - Removed GPU binding after MPI_Init to avoid switching context

OSU Micro Benchmarks v3.5 (11/09/11)

* New Features & Enhancements
    - Extension of osu_latency, osu_bw, and osu_bibw benchmarks to evaluate the
      performance of MPI_Send/MPI_Recv operation with NVIDIA GPU device and
      CUDA support
        - This functionality is exposed when configured with --enable-cuda
          option
        - Flexibility for using buffers in NVIDIA GPU device (D) and host
          memory (H)
        - Flexibility for selecting data movement between D->D, D->H and H->D

OSU Micro Benchmarks v3.4 (09/13/11)

* New Features & Enhancements
    - Add passive one-sided communication benchmarks
    - Update one-sided communication benchmarks to provide shared memory hint
      in MPI_Alloc_mem calls
    - Update one-sided communication benchmarks to use MPI_Alloc_mem for buffer
      allocation
    - Give default values to configure definitions (can now build directly with
      mpicc)
    - Update latency benchmarks to begin from 0 byte message

* Bug Fixes
    - Remove memory leaks in one-sided communication benchmarks
    - Update benchmarks to touch buffers before using them for communication
    - Fix osu_get_bw test to use different buffers for concurrent communication
      operations
    - Fix compilation warnings