.. _Multipoint Optimization with MPI:

.. note:: This feature requires MPI, which does not come with OpenAeroStruct by default.

Parallel Multipoint Optimization using MPI
==========================================

Multipoint analysis or optimization can be parallelized to reduce the runtime.
Because each flight condition (or point) is independent, the problem is `embarrassingly parallel`, meaning that these analyses can easily be run concurrently.

Here, we will parallelize the :ref:`previous multipoint aerostructural example (Q400)<Multipoint Optimization>`.
This requires a few modifications to the original serial runscript.

Runscript modifications
-----------------------
We first import MPI.
If this line does not work, make sure that you have a working MPI installation.

.. code-block:: python

    from mpi4py import MPI

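Once MPI is available, it is often convenient to restrict printing or file output to a single processor. The snippet below is a minimal, hypothetical sketch (not part of the original runscript) that falls back to serial behavior when mpi4py is not installed:

```python
# Query the processor rank so that printing or file output can be
# limited to the root processor. Falls back to serial behavior if
# mpi4py is not installed.
try:
    from mpi4py import MPI

    rank = MPI.COMM_WORLD.rank
    size = MPI.COMM_WORLD.size
except ImportError:
    rank, size = 0, 1

if rank == 0:
    print(f"This message is printed once, not {size} times.")
```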
You may need to turn off numpy multithreading.
This can be done by adding the following lines before importing numpy.
The name of the environment variable may differ depending on the system.

.. code-block:: python

    import os
    os.environ['OPENBLAS_NUM_THREADS'] = '1'

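Because the effective variable depends on which BLAS library numpy was linked against, a hedged approach is to set the variables for the common backends (OpenBLAS, MKL, and OpenMP builds) before the numpy import:

```python
import os

# Disable BLAS multithreading for the common backends; which variable
# actually applies depends on how numpy was built, so setting all three
# is harmless.
for var in ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS", "OMP_NUM_THREADS"):
    os.environ[var] = "1"

import numpy as np  # import numpy only after the variables are set
```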
Then, let's set up the problem in the same way as in the serial runscript.

.. code-block:: python

    prob = om.Problem()

    # Set up problem information in indep_var_comp
    ...

    # Add AerostructGeometry to the model
    ...

Next, we need to add the AS points under a ``ParallelGroup`` instead of directly under ``prob.model``.

.. literalinclude:: ../../tests/test_multipoint_parallel.py
    :dedent: 8
    :start-after: # [rst Setup ParallelGroup (beg)]
    :end-before: # [rst Setup ParallelGroup (end)]

After establishing variable connections and setting up the driver, we define the optimization objective and constraints.
Here, we will set up the parallel derivative computations.
In this example, we have 6 functions of interest (1 objective and 5 constraints), which would require 6 linear solves in series for reverse-mode derivatives.
Among the 6 functions, 4 depend only on AS_point_0, and 2 depend only on AS_point_1.
Therefore, we can form 2 pairs and perform the linear solves of each pair in parallel.
We specify ``parallel_deriv_color`` to tell OpenMDAO which functions' derivatives can be solved for in parallel.

.. literalinclude:: ../../tests/test_multipoint_parallel.py
    :dedent: 8
    :start-after: # [rst Parallel deriv color setup 1 (beg)]
    :end-before: # [rst Parallel deriv color setup 1 (end)]

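The counting argument can be sketched in plain Python (the function names below are only labels for illustration): with 4 functions on AS_point_0 and 2 on AS_point_1, pairing each AS_point_1 function with an AS_point_0 function reduces the 6 sequential linear solves to 4 sequential rounds.

```python
# Illustrative count only: 4 functions depend on AS_point_0, 2 on AS_point_1.
point0_funcs = ["fuelburn", "CL", "fuel_vol_delta", "fuel_diff"]
point1_funcs = ["L_equals_W", "failure"]

serial_rounds = len(point0_funcs) + len(point1_funcs)  # 6 solves, one at a time
pairs = min(len(point0_funcs), len(point1_funcs))      # 2 cross-point pairs run concurrently
parallel_rounds = pairs + (len(point0_funcs) - pairs)  # 2 paired rounds + 2 leftover solves
print(serial_rounds, parallel_rounds)  # prints "6 4"
```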
Furthermore, we will add a dummy (nonsense) constraint to illustrate how the parallelization works for reverse-mode derivatives.
This dummy constraint (the sum of the fuel burns from AS_point_0 and AS_point_1) depends on both AS points.
In this case, the linear solves of AS_point_0 and AS_point_1 will be parallelized.

.. literalinclude:: ../../tests/test_multipoint_parallel.py
    :dedent: 8
    :start-after: # [rst Parallel deriv color setup 2 (beg)]
    :end-before: # [rst Parallel deriv color setup 2 (end)]

Finally, let's change the linear solver from the default.
This step is not necessary and not directly relevant to parallelization, but the ``LinearBlockGS`` solver works better on a fine mesh than the default ``DirectSolver``.

.. literalinclude:: ../../tests/test_multipoint_parallel.py
    :dedent: 8
    :start-after: # [rst Change linear solver (beg)]
    :end-before: # [rst Change linear solver (end)]

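For background, ``LinearBlockGS`` applies block Gauss-Seidel iteration: each block of the coupled linear system is solved in turn, reusing the latest values from the other blocks. A minimal scalar sketch of the idea on a toy 2x2 system (not OpenAeroStruct code):

```python
# Gauss-Seidel iteration on a small diagonally dominant system A x = b;
# each unknown is updated in turn using the latest value of the other,
# which is the same sweep pattern LinearBlockGS applies block-wise.
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
for _ in range(50):
    x[0] = (b[0] - A[0][1] * x[1]) / A[0][0]
    x[1] = (b[1] - A[1][0] * x[0]) / A[1][1]
# x converges to the exact solution [1/11, 7/11]
```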
Complete runscript
------------------

.. embed-code::
    openaerostruct.tests.test_multipoint_parallel.Test.test_multipoint_MPI

To run this example in parallel with two processors, use the following command:

.. code-block:: bash

    $ mpirun -n 2 python <name of script>.py

Solver Outputs
--------------
When run with two processors, the stdout of the above script will look like the following.
The solver outputs help us understand how the solvers are parallelized for the analysis and the total derivative computations.

.. code-block:: bash

    # Nonlinear solver in parallel
    ===========================
    parallel.AS_point_0.coupled
    ===========================

    ===========================
    parallel.AS_point_1.coupled
    ===========================
    NL: NLBGS 1 ; 82168.4402 1
    NL: NLBGS 1 ; 79704.5639 1
    NL: NLBGS 2 ; 63696.5109 0.775194354
    NL: NLBGS 2 ; 68552.4805 0.860082248
    NL: NLBGS 3 ; 2269.83605 0.0276241832
    NL: NLBGS 3 ; 2641.30776 0.0331387267
    NL: NLBGS 4 ; 26.8901082 0.000327255917
    NL: NLBGS 4 ; 33.4963389 0.000420256222
    NL: NLBGS 5 ; 0.20014208 2.43575367e-06
    NL: NLBGS 5 ; 0.273747809 3.43453117e-06
    NL: NLBGS 6 ; 0.000203058798 2.47125048e-09
    NL: NLBGS 6 ; 0.00033072442 4.14937871e-09
    NL: NLBGS 7 ; 3.3285346e-06 4.05086745e-11
    NL: NLBGS 7 ; 5.16564573e-06 6.48099115e-11
    NL: NLBGS 8 ; 9.30405466e-08 1.13231487e-12
    NL: NLBGS Converged
    NL: NLBGS 8 ; 1.63279302e-07 2.04855649e-12
    NL: NLBGS 9 ; 2.01457772e-09 2.5275563e-14
    NL: NLBGS Converged

    # Linear solver for "parcon1". Derivatives of AS_point_0.fuelburn and AS_point_1.L_equals_W in parallel.
    ===========================
    parallel.AS_point_0.coupled
    ===========================

    ===========================
    parallel.AS_point_1.coupled
    ===========================
    LN: LNBGS 0 ; 180.248073 1
    LN: LNBGS 0 ; 1.17638541e-05 1
    LN: LNBGS 1 ; 0.00284457871 1.57814653e-05
    LN: LNBGS 1 ; 1.124189e-06 0.0955629836
    LN: LNBGS 2 ; 1.87700622e-08 0.00159557081
    LN: LNBGS 2 ; 4.66688449e-05 2.58914529e-07
    LN: LNBGS 3 ; 1.13549461e-11 9.65240308e-07
    LN: LNBGS Converged
    LN: LNBGS 3 ; 8.18485966e-08 4.54088609e-10
    LN: LNBGS 4 ; 9.00103905e-10 4.99369503e-12
    LN: LNBGS Converged

    # Linear solver for "parcon2". Derivatives of AS_point_0.CL and AS_point_1.wing_perf.failure in parallel.
    ===========================
    parallel.AS_point_1.coupled
    ===========================

    ===========================
    parallel.AS_point_0.coupled
    ===========================
    LN: LNBGS 0 ; 334.283603 1
    LN: LNBGS 0 ; 0.00958374526 1
    LN: LNBGS 1 ; 2.032696e-05 6.08075293e-08
    LN: LNBGS 1 ; 2.02092209e-06 0.000210869762
    LN: LNBGS 2 ; 2.3346978e-06 6.98418281e-09
    LN: LNBGS 2 ; 2.90180431e-08 3.02783956e-06
    LN: LNBGS 3 ; 4.98483883e-08 1.49120052e-10
    LN: LNBGS 3 ; 8.63240127e-11 9.0073359e-09
    LN: LNBGS Converged
    LN: LNBGS 4 ; 5.58667374e-11 1.67123774e-13
    LN: LNBGS Converged

    # Linear solver for derivatives of fuel_vol_delta.fuel_vol_delta (not parallelized)
    ===========================
    parallel.AS_point_0.coupled
    ===========================
    LN: LNBGS 0 ; 0.224468335 1
    LN: LNBGS 1 ; 3.54243924e-06 1.57814653e-05
    LN: LNBGS 2 ; 5.81181131e-08 2.58914529e-07
    LN: LNBGS 3 ; 1.01928513e-10 4.54088604e-10
    LN: LNBGS 4 ; 1.12121714e-12 4.99499023e-12
    LN: LNBGS Converged

    # Linear solver for derivatives of fuel_diff (not parallelized)
    ===========================
    parallel.AS_point_0.coupled
    ===========================
    LN: LNBGS 0 ; 0.21403928 1
    LN: LNBGS 1 ; 3.37785348e-06 1.57814653e-05
    LN: LNBGS 2 ; 5.54178795e-08 2.58914529e-07
    LN: LNBGS 3 ; 9.71927996e-11 4.54088612e-10
    LN: LNBGS Converged

    # Linear solver for derivatives of fuel_sum in parallel.
    ===========================
    parallel.AS_point_0.coupled
    ===========================

    ===========================
    parallel.AS_point_1.coupled
    ===========================
    LN: LNBGS 0 ; 360.496145 1
    LN: LNBGS 0 ; 511.274568 1
    LN: LNBGS 1 ; 0.00568915741 1.57814653e-05
    LN: LNBGS 1 ; 0.00838867553 1.64073788e-05
    LN: LNBGS 2 ; 0.00013534629 2.64723299e-07
    LN: LNBGS 2 ; 9.33376897e-05 2.58914529e-07
    LN: LNBGS 3 ; 1.00754737e-07 1.97065811e-10
    LN: LNBGS 3 ; 1.63697193e-07 4.54088609e-10
    LN: LNBGS 4 ; 2.24690253e-09 4.39470819e-12
    LN: LNBGS Converged
    LN: LNBGS 4 ; 1.80020781e-09 4.99369503e-12
    LN: LNBGS Converged

Comparing Runtime
-----------------
How much speedup can we get from parallelization?
Here, we compare the runtime for the example above (but with a finer mesh of `nx=3` and `ny=61`).
In this case, we achieved a decent speedup in the nonlinear analysis, but not as much in the derivative computation.
The actual speedup you can get depends on your problem setup, such as the number of points (flight conditions) and functions of interest.

.. list-table:: Runtime for Q400 example
    :widths: 30 35 35
    :header-rows: 1

    * - Case
      - Analysis walltime [s]
      - Derivatives walltime [s]
    * - Serial
      - 1.451
      - 5.775
    * - Parallel
      - 0.840
      - 4.983
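As a quick sanity check on the table, the implied speedups can be computed directly:

```python
# Speedup implied by the runtimes in the table above.
analysis_speedup = 1.451 / 0.840     # nonlinear analysis: about 1.73x
derivative_speedup = 5.775 / 4.983   # total derivatives: about 1.16x
print(f"{analysis_speedup:.2f}x analysis, {derivative_speedup:.2f}x derivatives")
```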