llnl · pearce8 · Apr 23, 2026 · Apr 21, 2026 · Apr 21, 2026 · Apr 23, 2026
diff --git a/docs/72_phloem/phloem.rst → docs/70_phloem/phloem.rst b/docs/72_phloem/phloem.rst → docs/70_phloem/phloem.rst
diff --git a/docs/70_omb/omb.rst → docs/71_omb/omb.rst b/docs/70_omb/omb.rst → docs/71_omb/omb.rst
diff --git a/docs/72_smb/smb.rst b/docs/72_smb/smb.rst
@@ -0,0 +1,149 @@
+*************************************
+Sandia Microbenchmarks - Message rate
+*************************************
+
+SMB Message Rate - A multi-node MPI point-to-point benchmark.  https://github.com/sandialabs/SMB.
+
+Purpose
+=======
+Sandia Microbenchmarks - Message Rate is a realistic messaging benchmark designed to emulate real application behavior. In particular, three things set this benchmark apart: (1) it uses a peer count to emulate different communication patterns; (2) it clears the cache between iterations to get realistic performance numbers; and (3) it tests different program behaviors, such as pre-posting receives, to evaluate performance under different scenarios.
+
+Characteristics
+===============
+
+Problem
+-------
+
+The SMB implements four different communication patterns: single direction, pair-based, pre-posted, and all-start. Each of these is a variation of behavior in a given communication pattern. For simplicity, we are limiting this evaluation to pre-posted.
+
+The msgrate executable accepts the following options::
+
+    -p <num>     Number of peers used in communication
+    -i <num>     Number of iterations per test
+    -m <num>     Number of messages per peer per iteration
+    -s <size>    Number of bytes per message
+    -c <size>    Cache size in bytes
+    -n <ppn>     Number of procs per node
+    -o           Format output to be machine readable
+
+Figure of Merit
+---------------
+As this is meant to represent a variety of application behaviors, there isn't a single figure of merit we can identify. However, we can identify a subset of input parameters to test.
+For the purposes of this test, the figure of merit is the pre-posted message rate across a range of message sizes and peer counts.
+
+
+Source code modifications
+=========================
+
+Please see :ref:`GlobalRunRules` for general guidance on allowed modifications. 
+
+Building
+========
+Accessing the sources
+
+* Clone the submodule from the benchmarks repository checkout 
+
+.. code-block:: bash
+
+   cd <path to benchmarks>
+   git submodule update --init --recursive
+   cd SMB/src/msgrate/
+
+..
+
+Build requirements:
+
+* C/C++ compiler(s) with support for C11 and C++14.
+
+* MPI 3.0+
+
+  * `OpenMPI 1.10+ <https://www.open-mpi.org/software/ompi/>`_
+  * `mpich <http://www.mpich.org>`_
+
+
+.. code-block:: bash
+
+   cd <path/to/smb> 
+   make -j
+
+.. 
+
+Testing the build:
+
+.. code-block:: bash
+
+    mpirun -n 8 msgrate -n 1
+
+.. 
+
+You should see output similar to the following. Note: because you are presumably testing on a single node at this point, the -n parameter needs to be set to 1. While this is erroneous from a performance standpoint, it is needed here because the SMB tries to ensure all communication is done across the network and thus cannot otherwise be run on a single node.
+
+.. code-block:: bash
+
+    job size:   8
+    npeers:     6
+    niters:     4096
+    nmsgs:      128
+    nbytes:     8
+    cache size: 8388608
+    ppn:        1
+    single direction: 2578047.02
+    pair-based: 4343577.14
+      pre-post: 1889840.49
+     all-start: 2398236.06
+
+..
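+
+Since the figure of merit for this evaluation is the pre-posted rate, it can be convenient to pull just that value out of the human-readable output shown above. The following is a minimal sketch; the function name ``prepost_rate`` is hypothetical, and it assumes the ``pre-post:`` label format of the sample output (the machine-readable format produced by -o is not covered here).
+
+.. code-block:: bash
+
+   # Hypothetical helper: read msgrate's human-readable output on stdin
+   # and print only the value on the "pre-post:" line.
+   prepost_rate() {
+       awk -F': *' '/pre-post:/ { print $2 }'
+   }
+
+   # Demonstration on two lines copied from the sample output above.
+   # With a real build: mpirun -n 8 msgrate -n 1 | prepost_rate
+   printf '    pair-based: 4343577.14\n      pre-post: 1889840.49\n' | prepost_rate
+
+..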
+
+
+Running
+=======
+We describe two tests using SMB message rate here. The first is based on a 2D 9-point stencil code and the second on a 3D 27-point stencil code.
+Each of these needs to be run for various message sizes and scales to test the performance of the entire system.
+
+
+We define some system specific variables for these tests.
+
+* PPN - the number of processes per node.
+* CACHE - 2x the size, in bytes, of the largest cache (note: we use 2x here to be thorough)
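+
+As a minimal sketch of setting CACHE, the snippet below doubles a cache size given in bytes. The helper name ``cache_arg`` and the 32 MiB cache size are assumptions made for illustration only; substitute the size of the largest cache on the target system.
+
+.. code-block:: bash
+
+   # Hypothetical helper: double a cache size given in bytes.
+   cache_arg() {
+       echo $(( 2 * $1 ))
+   }
+
+   # Assumed example: a 32 MiB last-level cache. On glibc-based Linux systems,
+   # "getconf LEVEL3_CACHE_SIZE" may report the L3 size in bytes instead.
+   CACHE=$(cache_arg $((32 * 1024 * 1024)))
+   echo "$CACHE"
+
+..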
+
+Note: these tests can be memory intensive, with memory usage growing as :math:`O(message\_size \times number\_of\_messages \times number\_of\_peers \times processes\_per\_node)`. If memory issues occur, use the -m flag to reduce the number of messages per iteration at higher message sizes.
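+
+To get a feel for this growth, the sketch below multiplies the four factors for one configuration. The function name ``estimate_bytes`` is hypothetical, the PPN value of 4 is an assumed example, and the product is only the growth term stated above, not an exact accounting of what msgrate allocates.
+
+.. code-block:: bash
+
+   # Hypothetical helper: product of the four factors in the growth term.
+   # Arguments: message size (bytes), messages per peer, peers, processes per node.
+   estimate_bytes() {
+       echo $(( $1 * $2 * $3 * $4 ))
+   }
+
+   # Largest message size in the sweeps (2**24 bytes), 128 messages (the nmsgs
+   # value shown in the sample output), 26 peers, and an assumed PPN of 4.
+   estimate_bytes $((2**24)) 128 26 4   # prints 223338299392 (208 GiB)
+
+..
+
+Under these assumptions, lowering -m from 128 to 16 would shrink the term by a factor of eight.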
+
+* 9 point stencil
+
+.. code-block:: bash
+
+   for i in {0..24}; do mpirun msgrate -n $PPN -p 8 -c $CACHE -s $((2**i)) -o; done
+
+..
+
+
+* 27 point stencil
+
+.. code-block:: bash
+
+   for i in {0..24}; do mpirun msgrate -n $PPN -p 26 -c $CACHE -s $((2**i)) -o; done
+
+..
+
+Results from SMB are provided on the following systems:
+
+Validation
+==========
+
+
+Example Scalability Results
+===========================
+
+
+Memory Usage
+============
+
+
+Strong Scaling on El Capitan
+============================
+
+
+Weak Scaling on El Capitan
+==========================
+
+
+References
+==========
diff --git a/docs/71_gpcnet/gpcnet.rst → docs/73_gpcnet/gpcnet.rst b/docs/71_gpcnet/gpcnet.rst → docs/73_gpcnet/gpcnet.rst
diff --git a/docs/index.rst b/docs/index.rst
@@ -47,9 +47,10 @@ FCR Benchmarks Project. ATTENTION: This page is a work in progress and nothing i
    :numbered:
    :caption: Microbenchmarks
 
-   70_omb/omb
-   71_gpcnet/gpcnet
-   72_phloem/phloem
+   70_phloem/phloem
+   71_omb/omb
+   72_smb/smb
+   73_gpcnet/gpcnet
    80_ior/ior
    81_mdtest/mdtest
    82_dlio/dlio