Refactor bdy imp3 by MetBenjaminWent · Pull Request #561 · MetOffice/lfric_apps

Benjamin Went (MetBenjaminWent) · 2026-06-18T14:29:01Z

PR Summary

Sci/Tech Reviewer:
Code Reviewer:

Linked Issues:
Umbrella: #560
Umbrella Work: #106

Code Quality Checklist

I have performed a self-review of my own code
My code follows the project's style guidelines
Comments have been included that aid understanding and enhance the readability of the code
My changes generate no new warnings
All automated checks in the CI pipeline have completed successfully

Testing

I have tested this change locally, using the LFRic Apps rose-stem suite
If any tests fail (rose-stem or CI) the reason is understood and acceptable (e.g. kgo changes)
I have added tests to cover new functionality as appropriate (e.g. system tests, unit tests, etc.)
Any new tests have been assigned an appropriate amount of compute resource and have been allocated to an appropriate testing group (i.e. the developer tests are for jobs which use a small amount of compute resource and complete in a matter of minutes)

trac.log

Security Considerations

I have reviewed my changes for potential security issues
Sensitive data is properly handled (if applicable)
Authentication and authorisation are properly implemented (if applicable)

Performance Impact

Performance of the code has been considered and, if applicable, suitable performance measurements have been conducted

AI Assistance and Attribution

Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

Documentation

Where appropriate I have updated documentation related to this change and confirmed that it builds correctly

PSyclone Approval

If you have edited any PSyclone-related code (e.g. PSyKAl-lite, Kernel interface, optimisation scripts, LFRic data structure code) then please contact the TCD Team

Sci/Tech Review

I understand this area of code and the changes being added
The proposed changes correspond to the pull request description
Documentation is sufficient (do documentation papers need updating)
Sufficient testing has been completed

(Please alert the code reviewer via a tag when you have approved the SR)

Code Review

All dependencies have been resolved
Related Issues have been properly linked and addressed
CLA compliance has been confirmed
Code quality standards have been met
Tests are adequate and have passed
Documentation is complete and accurate
Security considerations have been addressed
Performance impact is acceptable

Benjamin Went (MetBenjaminWent) · 2026-06-19T10:56:57Z

vern_comb_after_3n_16T_run2.txt:| bdy_imp3 | 23.37296 | 22.793 | 24.303 | 23.37296 | 22.793 | 24.303 | 421.66104 | 480 | 3.54854 | 0.04869 |
vern_comb_after_3n_1T_run2.txt:| bdy_imp3 | 0.4702 | 0.375 | 0.745 | 0.4702 | 0.375 | 0.745 | 306.8558 | 480 | 0.15091 | 0.00098 |
vern_comb_after_3n_2T_run2.txt:| bdy_imp3 | 1.87144 | 1.63 | 2.349 | 1.87144 | 1.63 | 2.349 | 231.32831 | 480 | 0.73924 | 0.0039 |
vern_comb_after_3n_4T_run2.txt:| bdy_imp3 | 4.64524 | 4.215 | 5.152 | 4.64524 | 4.215 | 5.152 | 227.4877 | 480 | 1.596 | 0.00968 |
vern_comb_after_3n_8T_run2.txt:| bdy_imp3 | 9.17898 | 8.857 | 10.841 | 9.17898 | 8.857 | 10.841 | 276.30565 | 480 | 2.24013 | 0.01912 |
vern_comb_before_3n_16T_run1.txt:| bdy_imp3 | 0.60892 | 0.595 | 0.629 | 0.60892 | 0.595 | 0.629 | 634.10033 | 480 | 0.09508 | 0.00127 |
vern_comb_before_3n_1T_run1.txt:| bdy_imp3 | 0.2106 | 0.058 | 0.522 | 0.2106 | 0.058 | 0.522 | 310.62336 | 480 | 0.06743 | 0.00044 |
vern_comb_before_3n_2T_run1.txt:| bdy_imp3 | 0.19568 | 0.131 | 0.265 | 0.19568 | 0.131 | 0.265 | 250.37705 | 480 | 0.0777 | 0.00041 |
vern_comb_before_3n_4T_run1.txt:| bdy_imp3 | 0.20734 | 0.166 | 0.259 | 0.20734 | 0.166 | 0.259 | 284.07115 | 480 | 0.07264 | 0.00043 |
vern_comb_before_3n_8T_run1.txt:| bdy_imp3 | 0.31942 | 0.27 | 0.42 | 0.31942 | 0.27 | 0.42 | 397.84844 | 480 | 0.07965 | 0.00067 |

It's way, way worse doing it this way: at 16 threads the first timing column for bdy_imp3 goes from 0.60892 to 23.37296. Hmm. Not as easy to work out.
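To put a number on "worse", here is a minimal sketch that pairs the before/after rows above by thread count and divides the first numeric column. The column's meaning is not stated in the rows, so treating it as the headline time for the region is an assumption, and `parse_rows` and `slowdown` are hypothetical helper names, not part of any profiler tooling.

```python
import re

# Rows copied verbatim from the comment above.
ROWS = """\
vern_comb_after_3n_16T_run2.txt:| bdy_imp3 | 23.37296 | 22.793 | 24.303 | 23.37296 | 22.793 | 24.303 | 421.66104 | 480 | 3.54854 | 0.04869 |
vern_comb_after_3n_1T_run2.txt:| bdy_imp3 | 0.4702 | 0.375 | 0.745 | 0.4702 | 0.375 | 0.745 | 306.8558 | 480 | 0.15091 | 0.00098 |
vern_comb_after_3n_2T_run2.txt:| bdy_imp3 | 1.87144 | 1.63 | 2.349 | 1.87144 | 1.63 | 2.349 | 231.32831 | 480 | 0.73924 | 0.0039 |
vern_comb_after_3n_4T_run2.txt:| bdy_imp3 | 4.64524 | 4.215 | 5.152 | 4.64524 | 4.215 | 5.152 | 227.4877 | 480 | 1.596 | 0.00968 |
vern_comb_after_3n_8T_run2.txt:| bdy_imp3 | 9.17898 | 8.857 | 10.841 | 9.17898 | 8.857 | 10.841 | 276.30565 | 480 | 2.24013 | 0.01912 |
vern_comb_before_3n_16T_run1.txt:| bdy_imp3 | 0.60892 | 0.595 | 0.629 | 0.60892 | 0.595 | 0.629 | 634.10033 | 480 | 0.09508 | 0.00127 |
vern_comb_before_3n_1T_run1.txt:| bdy_imp3 | 0.2106 | 0.058 | 0.522 | 0.2106 | 0.058 | 0.522 | 310.62336 | 480 | 0.06743 | 0.00044 |
vern_comb_before_3n_2T_run1.txt:| bdy_imp3 | 0.19568 | 0.131 | 0.265 | 0.19568 | 0.131 | 0.265 | 250.37705 | 480 | 0.0777 | 0.00041 |
vern_comb_before_3n_4T_run1.txt:| bdy_imp3 | 0.20734 | 0.166 | 0.259 | 0.20734 | 0.166 | 0.259 | 284.07115 | 480 | 0.07264 | 0.00043 |
vern_comb_before_3n_8T_run1.txt:| bdy_imp3 | 0.31942 | 0.27 | 0.42 | 0.31942 | 0.27 | 0.42 | 397.84844 | 480 | 0.07965 | 0.00067 |
"""

# Captures: variant (before/after), thread count, region name, first numeric column.
ROW_RE = re.compile(
    r"vern_comb_(before|after)_3n_(\d+)T_run\d+\.txt:\|\s*(\w+)\s*\|\s*([\d.]+)"
)


def parse_rows(text):
    """Map (variant, threads) to the first numeric column of each row."""
    table = {}
    for line in text.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            variant, threads, _region, first = match.groups()
            table[(variant, int(threads))] = float(first)
    return table


def slowdown(table):
    """Ratio after/before of the first column, keyed by thread count."""
    thread_counts = sorted(t for (variant, t) in table if variant == "before")
    return {t: table[("after", t)] / table[("before", t)] for t in thread_counts}


if __name__ == "__main__":
    for threads, ratio in slowdown(parse_rows(ROWS)).items():
        print(f"{threads:>2}T: {ratio:5.1f}x")
```

On these rows the ratio grows with thread count, from a little over 2x at 1 thread to roughly 38x at 16 threads, so the regression gets worse as threads are added rather than staying a fixed overhead.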

Benjamin Went (MetBenjaminWent) · 2026-06-19T12:23:38Z

So it's not the removal of the vector loop. It's the size of the loop (and/or possibly the dynamic schedule).

Splitting it back into three loops, each with its own OMP coverage (one_over_v is effectively a loop over an array of size i_len / threads), has brought the performance back to something like normal.
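The "i_len / threads" remark can be pictured with a small sketch of an even, contiguous split of a loop range across threads. This is only an illustration of the arithmetic, not the LFRic code or the OpenMP runtime: `static_chunks` is a hypothetical helper, and the values of `i_len` and `threads` below are made up.

```python
def static_chunks(i_len, threads):
    """Split range(i_len) into `threads` contiguous half-open chunks.

    The first `i_len % threads` chunks get one extra iteration, so chunk
    sizes differ by at most one. This mimics an even static split; it is
    an assumption about the shape of the work, not a runtime guarantee.
    """
    base, extra = divmod(i_len, threads)
    chunks = []
    start = 0
    for tid in range(threads):
        size = base + (1 if tid < extra else 0)
        chunks.append((start, start + size))
        start += size
    return chunks


if __name__ == "__main__":
    # Illustrative sizes only.
    for tid, (lo, hi) in enumerate(static_chunks(i_len=100, threads=16)):
        print(f"thread {tid:>2}: [{lo}, {hi}) -> {hi - lo} iterations")
```

With a split like this each thread touches a short contiguous slice, which is one plausible reason per-loop scheduling overhead would matter more as the thread count rises and the slices shrink.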

vern_comb_after_3n_16T_run1.txt:| bdy_imp3 | 0.69154 | 0.676 | 0.735 | 0.69154 | 0.676 | 0.735 | 633.28846 | 480 | 0.10804 | 0.00144 |
vern_comb_after_3n_1T_run1.txt:| bdy_imp3 | 0.17841 | 0.057 | 0.545 | 0.17841 | 0.057 | 0.545 | 312.58077 | 480 | 0.05683 | 0.00037 |
vern_comb_after_3n_2T_run1.txt:| bdy_imp3 | 0.21947 | 0.15 | 0.281 | 0.21947 | 0.15 | 0.281 | 250.61117 | 480 | 0.08703 | 0.00046 |
vern_comb_after_3n_4T_run1.txt:| bdy_imp3 | 0.2397 | 0.199 | 0.308 | 0.2397 | 0.199 | 0.308 | 287.77153 | 480 | 0.08272 | 0.0005 |
vern_comb_after_3n_8T_run1.txt:| bdy_imp3 | 0.36944 | 0.34 | 0.495 | 0.36944 | 0.34 | 0.495 | 396.13619 | 480 | 0.09233 | 0.00077 |
vern_comb_before_3n_16T_run1.txt:| bdy_imp3 | 0.60892 | 0.595 | 0.629 | 0.60892 | 0.595 | 0.629 | 634.10033 | 480 | 0.09508 | 0.00127 |
vern_comb_before_3n_1T_run1.txt:| bdy_imp3 | 0.2106 | 0.058 | 0.522 | 0.2106 | 0.058 | 0.522 | 310.62336 | 480 | 0.06743 | 0.00044 |
vern_comb_before_3n_2T_run1.txt:| bdy_imp3 | 0.19568 | 0.131 | 0.265 | 0.19568 | 0.131 | 0.265 | 250.37705 | 480 | 0.0777 | 0.00041 |
vern_comb_before_3n_4T_run1.txt:| bdy_imp3 | 0.20734 | 0.166 | 0.259 | 0.20734 | 0.166 | 0.259 | 284.07115 | 480 | 0.07264 | 0.00043 |
vern_comb_before_3n_8T_run1.txt:| bdy_imp3 | 0.31942 | 0.27 | 0.42 | 0.31942 | 0.27 | 0.42 | 397.84844 | 480 | 0.07965 | 0.00067 |

It also highlights that blocking was still beneficial. We should cross-compare against another data point.
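As a rough check on how much of the remaining gap is signal, the sketch below compares the split-loop rows with the baseline rows and flags whether their spreads overlap. It assumes the second and third numeric columns are a minimum and maximum (they bracket the first column in every row, but the rows do not label them), and `compare` and `ranges_overlap` are hypothetical helper names.

```python
# (first, second, third) numeric columns per thread count, copied from the rows above.
BEFORE = {
    1: (0.2106, 0.058, 0.522),
    2: (0.19568, 0.131, 0.265),
    4: (0.20734, 0.166, 0.259),
    8: (0.31942, 0.27, 0.42),
    16: (0.60892, 0.595, 0.629),
}
AFTER = {
    1: (0.17841, 0.057, 0.545),
    2: (0.21947, 0.15, 0.281),
    4: (0.2397, 0.199, 0.308),
    8: (0.36944, 0.34, 0.495),
    16: (0.69154, 0.676, 0.735),
}


def ranges_overlap(a, b):
    """True if the [second, third] column intervals of two rows intersect."""
    _, a_lo, a_hi = a
    _, b_lo, b_hi = b
    return a_lo <= b_hi and b_lo <= a_hi


def compare(before, after):
    """Per thread count: (percent change in first column, spreads overlap?)."""
    report = {}
    for threads in sorted(before):
        change = (after[threads][0] / before[threads][0] - 1.0) * 100.0
        report[threads] = (change, ranges_overlap(before[threads], after[threads]))
    return report


if __name__ == "__main__":
    for threads, (change, overlap) in compare(BEFORE, AFTER).items():
        note = "spreads overlap" if overlap else "spreads do not overlap"
        print(f"{threads:>2}T: {change:+6.1f}%  ({note})")
```

Under that min/max assumption, only the 16-thread case shows non-overlapping spreads, which supports collecting another data point before drawing conclusions from the smaller thread counts.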

Benjamin Went (MetBenjaminWent) added 4 commits June 18, 2026 13:35

push up initial ideas to test performance

e75cf31

Remove the extra dim for psyclone as theres no point - test this and just time it on it's own

6a068a5

correct step

0082e71

remove ii

d734a67

Benjamin Went (MetBenjaminWent) self-assigned this Jun 18, 2026

Benjamin Went (MetBenjaminWent) mentioned this pull request Jun 18, 2026

Further refactor bdy_imp3 for psyclone #560

Open

housekeeping and both dynamic

1349325

Benjamin Went (MetBenjaminWent) wants to merge 5 commits into MetOffice:main from MetBenjaminWent:refactor-bdy_imp3

2 participants
