Skip to content

on-the-fly computation of MSAD for trimers#76

Open
cyfraysse wants to merge 24 commits into
TheDisorderedOrganization:mainfrom
cyfraysse:feature/msad
Open

on-the-fly computation of MSAD for trimers#76
cyfraysse wants to merge 24 commits into
TheDisorderedOrganization:mainfrom
cyfraysse:feature/msad

Conversation

@cyfraysse
Copy link
Copy Markdown
Contributor

@cyfraysse cyfraysse commented May 12, 2026

Computes MSAD on-the-fly during simulation for three methods: Euler, Integral, and Threshold.

Two independent schedules:

  • compute_schedule: when to update rotation state (passed to Arianna)
  • output_schedule: when to write to disk (managed internally, log-spaced)

New file:

  • src/msad.jl — MSADState, MSADTracker, body_frame, rotation_vector
    Changed file:
    -src/Particles.jl

TOML usage:
[[simulation.output]]
algorithm = "MSADTracker"
scheduler_params = {linear_interval = 1}
parameters = {theta_T = @Theta@}
output_scheduler_params = {log_base = 2.0}

@cyfraysse cyfraysse requested a review from a team as a code owner May 12, 2026 14:29
@romainljsimon romainljsimon requested review from romainljsimon and removed request for a team May 12, 2026 14:39
@codecov
Copy link
Copy Markdown

codecov Bot commented May 12, 2026

Codecov Report

❌ Patch coverage is 0% with 111 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/msad.jl 0.00% 100 Missing ⚠️
src/ParticlesMC.jl 0.00% 11 Missing ⚠️
Files with missing lines Coverage Δ
src/ParticlesMC.jl 7.69% <0.00%> (-0.80%) ⬇️
src/msad.jl 0.00% <0.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread src/ParticlesMC.jl Outdated
Comment on lines +240 to +264
elseif alg == "MSADTracker"
# read theta_T from parameters
parameters = get(output, "parameters", Dict())
theta_T = Float64(get(parameters, "theta_T", π/4))

# build output schedule separately from compute schedule
output_scheduler_params = get(output, "output_scheduler_params", Dict())
if "log_base" in keys(output_scheduler_params)
output_sched = build_schedule(steps, burn, 2.0)
elseif "linear_interval" in keys(output_scheduler_params)
output_sched = build_schedule(steps, burn,
output_scheduler_params["linear_interval"])
else
output_sched = sched # default: same as compute
end

# convert to Vector{Int} — build_schedule returns this already
algorithm = (
algorithm=MSADTracker,
scheduler=sched, # compute schedule
theta_T=theta_T,
output_schedule=output_sched,
path=output_path,
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code never gets to this branch, alg is undefined anyway.
Need to remove, this is a copy/paste mistake from the output section below.

@V-Francois
Copy link
Copy Markdown
Contributor

[[simulation.output]]
algorithm = "MSADTracker"
scheduler_params = {linear_interval = 1}
parameters = {theta_T = @Theta@}
output_scheduler_params = {log_base = 2.0}

This is a bit misleading, as for the other outputs, scheduler_params codes for the times at which you output.
I propose to use this as the output_scheduler_params, without renaming.
And use a second name, called something like compute_scheduler_params, that encodes for how frequently you want to compute the rotations

Comment thread src/msad.jl Outdated

# open output file — write header
tracker.files[c] = open(tracker.paths[c], "w")
println(tracker.files[c], "# t msad_euler msad_integral msad_thresh")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove double space between column names

Comment thread src/msad.jl
for m in 1:N_mol
dR = state.R_ref_thresh[m]' * R_all[m]
phi_current[m] = rotation_vector(dR)
phi_total[m] = state.phi_acc[m] + phi_current[m]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable is redundant, as it can be recomputed any time from the other two

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, plus I don't think creating two empty vectors is necessary.

I guess one could write:

dR             = state.R_ref_thresh[m]' * R_all[m]
phi_current = rotation_vector(dR)
if norm(phi_current) >= tracker.theta_T
       state.phi_acc[m]   = state.phi_acc[m] + phi_current
       state.R_ref_thresh[m] = R_all[m]
end

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Then I also changed

phi_total[m]

by

state.phi_acc[m] + phi_current[m]

in the output for the threshold method

Copy link
Copy Markdown
Member

@romainljsimon romainljsimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @cyfraysse for your work and this PR, it's great!

However, I bet that the MSAD is not the only observable we're going to want to look at in the future. For example, we'd like to study the angular Van Hove (angular distributions) in our molecules.
Thus, I think it should be better to store trajectories of angular displacements of all molecules (for both the integral and the threshold method), just as it's done for positions. This would allow us to compute the MSAD, the Van Hove (why not the Fs(q,t)?) in postprocessing.

What do you think @cyfraysse @V-Francois ?

I also added a few minor comments.

Comment thread src/msad.jl
Comment on lines +42 to +50
function rotation_vector(R::SMatrix{3,3,T}) where {T}
cos_θ = clamp((tr(R) - 1) / 2, -one(T), one(T))
# skew-symmetric part encodes sin(θ)*n
skew = (R - R') / 2
sin_n = SVector(skew[3,2], skew[1,3], skew[2,1]) # julia starts index at 1
sin_θ = norm(sin_n)
θ = atan(sin_θ, cos_θ)
return sin_θ > 1e-10 ? (θ / sin_θ) * sin_n : zero(SVector{3,T}) ## might need to add the theta = pi case (unprobable for MC steps)
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, but it's not the best option is $1/sin(\theta)$ is ill-conditioned near $0$ and $\pi$. The best way to calculate the rotation vector from the rotation matrix is described in Sec IIC of https://arxiv.org/pdf/2604.21512

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with what is said in Sec IIC. However, taking n as the eigenvector associated with the eigenvalue 1 is misleading because n and -n are solutions. Both values are taken randomly when extracting eigenvectors and this lead to "false" rotation of ±π. Using the skew matrix we remove this ambiguity. Then should I add a case where theta is close to pi?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why would this be a problem ?
Anyways for the thresh method, you take the absolute value of the angle to check the condition.
For the integral method, you sum $\theta n$ which has no ambiguity.
Am I missing something ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is irrelevant for the condition check in th threshold method. However, when doing the accumulation sum(θn) we will accumulate random sign flip no? And then norm(ɸ) does not represent the true rotation.

When I was validating my operation in python post process I had to use skew symmetric part to define n instead of eigenvectors.

Copy link
Copy Markdown
Member

@romainljsimon romainljsimon May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you do the calculation you indeed find two solutions $(-\theta, -n)$ and $(\theta, n)$. However, the vector $\theta n = -\theta \times -n$ (Euler vector) is unique. So this shouldn't be a problem :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK yes you're right sorry... I am changing for the eigenvector rn.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! There should be an implementation here : https://github.com/romainljsimon/MoonWalk.jl

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question : when using the eigenvector/value decomposition Julia might need to store things in memory. Isn't that going to slow down the simulation?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, yes it is a slower but safer implementation!

Comment thread src/msad.jl Outdated
# called once at t = 0, set the system, the mutable struct and the storing paths
########## ##########

function Arianna.initialise(tracker::MSADTracker, simulation::Simulation)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(algorithm::MSADTracker, simulation::Simulation)

For consistency with Arianna.jl.
So the same remark for the other algorithm functions.

Comment thread src/msad.jl
for m in 1:N_mol
dR = state.R_ref_thresh[m]' * R_all[m]
phi_current[m] = rotation_vector(dR)
phi_total[m] = state.phi_acc[m] + phi_current[m]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, plus I don't think creating two empty vectors is necessary.

I guess one could write:

dR             = state.R_ref_thresh[m]' * R_all[m]
phi_current = rotation_vector(dR)
if norm(phi_current) >= tracker.theta_T
       state.phi_acc[m]   = state.phi_acc[m] + phi_current
       state.R_ref_thresh[m] = R_all[m]
end

@cyfraysse
Copy link
Copy Markdown
Contributor Author

Thank you for your review @romainljsimon! I have changed what you mentioned with @V-Francois.

About the observables... we had this discussion with @V-Francois. If we store the whole trajectory of phi for the threshold and integral methods we will be back at the storage problem no? And if we log sample it then we will have very few points for the distribution.

We thought about adding a third scheduler that would be triggered once in the long time regime that will store phi trajectories for few time step there... what do you think about that?

@romainljsimon
Copy link
Copy Markdown
Member

Thank you for your review @romainljsimon! I have changed what you mentioned with @V-Francois.

:)

About the observables... we had this discussion with @V-Francois. If we store the whole trajectory of phi for the threshold and integral methods we will be back at the storage problem no? And if we log sample it then we will have very few points for the distribution.

Yes I agree that we shouldn't store them at each time step or we'll be back to the storage problem. I think the log sampling is the way to go, I'm not sure why this would be a problem for the distribution, that's how we do it for translations. You can run multiple simulations in parallel with the same scheduler, and have a distribution at different times with a lot of points. Again, am I missing something ?
+This allows to calculate other observables like the fsqt.

@cyfraysse
Copy link
Copy Markdown
Contributor Author

Ok with the multiple simulations in parallel I totally agree, thank you! Then we only need to extract in log the phi for the three methods. Post process for MSAD and Van Hove. For Fs C2 and MSD we continue to extract the configurations in log also. Am I right?

@romainljsimon
Copy link
Copy Markdown
Member

Ok with the multiple simulations in parallel I totally agree, thank you! Then we only need to extract in log the phi for the three methods. Post process for MSAD and Van Hove. For Fs C2 and MSD we continue to extract the configurations in log also. Am I right?

Yes I agree!
For the Euler method you only need the positions at time t so no need to store $\phi$ for this method I guess.

Comment thread src/msad.jl Outdated
files_thresh::Vector{IOStream} # file to write rotation vector in threshold method
end

function MSADTracker(chains, theta_T::T, compute_schedule::Vector{Int}, output_schedule::Vector{Int}, path::String) where {T}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also thinking that StorePhiTrajectories or something of the kind would be a better name for this as we're not calculating directly the MSAD anymore.

Maybe even two functions:
StorePhiIntegralTrajectory
StorePhiThreshTrajectory

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Idea! Just one point about doing two functions, we will loose the shared states. Like we will always have to compute 2 times R_all for example.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to do only one function and ask for one or both methods? Boolean parameter?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking about this, I'm not sure what would be the best solution.

Comment thread Project.toml
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to commit this change ?

@cyfraysse
Copy link
Copy Markdown
Contributor Author

Running a simulation at T=2.0... I might have done something wrong in the accumulation The integral and threshold methods don't behave well. In the accumulation I don't see what is wrong.
image

@romainljsimon
Copy link
Copy Markdown
Member

Running a simulation at T=2.0... I might have done something wrong in the accumulation The integral and threshold methods don't behave well. In the accumulation I don't see what is wrong. image

Does the Euler Method converge to the good value ?

@cyfraysse
Copy link
Copy Markdown
Contributor Author

Yes it is. Only the marge time behaviour seems wrong. It seems like the phi vector is bounded in all methods

@romainljsimon
Copy link
Copy Markdown
Member

Yes it is. Only the marge time behaviour seems wrong. It seems like the phi vector is bounded in all methods

The marge time ?

@cyfraysse
Copy link
Copy Markdown
Contributor Author

The Large time sorry

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants