Skip to content

feat: LHCb modules migration#97

Draft
AcquaDiGiorgio wants to merge 26 commits into
DIRACGrid:mainfrom
AcquaDiGiorgio:issue-87-LHCb-UploadLogFile
Draft

feat: LHCb modules migration#97
AcquaDiGiorgio wants to merge 26 commits into
DIRACGrid:mainfrom
AcquaDiGiorgio:issue-87-LHCb-UploadLogFile

Conversation

@AcquaDiGiorgio
Copy link
Copy Markdown
Contributor

@AcquaDiGiorgio AcquaDiGiorgio commented Feb 4, 2026

See #87

First approach, needs review.

I tried to use the current Mock classes (MockDataManager), but I don't think this is the correct way. However, it should not vary much.

I open it as a draft because I still need to create 2 more tests and change the interactions with DIRAC.

@AcquaDiGiorgio AcquaDiGiorgio self-assigned this Feb 4, 2026
@AcquaDiGiorgio AcquaDiGiorgio linked an issue Feb 4, 2026 that may be closed by this pull request
4 tasks
@AcquaDiGiorgio
Copy link
Copy Markdown
Contributor Author

There is an issue with the FailoverRequest.
The DIRAC's workflow, during the finalize phase calls sendFailoverRequest, which through the ReqClient sends the stored requests at the workflow_commons dictionary to DIRAC.
Currently, in dirac-cwl we don't have persitency between command executions, so the operations created are lost.

@AcquaDiGiorgio AcquaDiGiorgio force-pushed the issue-87-LHCb-UploadLogFile branch from c3f5c70 to 93cb40c Compare February 16, 2026 11:34
@aldbr aldbr linked an issue Apr 15, 2026 that may be closed by this pull request
@AcquaDiGiorgio AcquaDiGiorgio changed the title feat: UploadLogFile command implementation feat: LHCb modules migration Apr 16, 2026
@AcquaDiGiorgio
Copy link
Copy Markdown
Contributor Author

This PR will now tackle the migration of every single module (starting from the ones used in MCSim) by importing LHCb specific functions instead of rewritting them from the ground up

Comment thread src/dirac_cwl/commands/__init__.py Outdated
Comment thread src/dirac_cwl/commands/workflow_accounting.py Outdated
Comment thread src/dirac_cwl/commands/bookkeeping_report.py Outdated
Comment thread src/dirac_cwl/commands/utils.py Outdated
Comment thread src/dirac_cwl/commands/utils.py Outdated
Comment thread src/dirac_cwl/commands/workflow_accounting.py Outdated
Comment thread src/dirac_cwl/commands/workflow_accounting.py Outdated
Comment thread src/dirac_cwl/commands/utils.py Outdated
Comment thread src/dirac_cwl/commands/utils.py Outdated
Comment thread src/dirac_cwl/commands/failover_request.py
logger.info("Preparing DISET request for %s", bk_file)

logger.info("Creating DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK in order to disable the Watchdog")
with open("DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you create new files like this, please make sure they are created in job_path.
In the PushJobAgent, we might have multiple job wrappers running in parallel in the future, you want to make sure they will not collide.

Suggested change
with open("DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f:
with open(job_path / "DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f:

class UploadOutputData(PostProcessCommand):
"""Registers every output generated to the corresponding SE and Catalog or to the FailoverSE in case of failure."""

def _execute(self, job_path: os.PathLike, workflow_commons: WorkflowCommons, **kwargs) -> None:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why os.PathLike here and not pathlib.Path?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I'm aware, the recommended way of typing path parameters is by using os.PathLike, which reminds me that that should be Union[str, os.PathLike], as str is not a subtype of os.PathLike.
On the other hand, I mostly use pathlib.Path for file operations, so maybe is not a bad idea to just use pathlib.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I didn't know. From what I read, os.PathLike[str] is abstract and useful in your specific context for typing, but then you can (should?) "cast" it as a pathlib.Path to manipulate it.
Now given that we completely control this area of the code, I naively assume that having pathlib.Path is safe.
But I would happily follow your recommendation 🙂


# Write to file
bfilename = f"bookkeeping_{step_commons.id}.xml"
with open(bfilename, "wb") as bfile:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, here you also want this to be part of job_path, let's write it explicitly may be.

Suggested change
with open(bfilename, "wb") as bfile:
with open(job_path / bfilename, "wb") as bfile:

Comment thread src/dirac_cwl/commands/bookkeeping_report.py
except SErrorException as e:
logger.error("Failed to list the log directory\n%s", e)

if file_list:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if an exception is raised? I assume file_list would not exist and you would get another exception?
Can you test that please?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, I missed that. I will check every newly added returnValueOrRaise, as it changes the code blocks.

value = returnValueOrRaise(workflow_commons.file_report.generateForwardDISET())
if not value:
logger.info("On second attempt, files correctly reported to TransformationDB")
elif workflow_commons.step_status == StepStatus.Done:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also based on my previous comment in the WorkflowAccounting: it would be interesting to see whether there is a way to get the status of the cwl execution vs a step execution.

Because if this is possible, then may be there is a way to reproduce exactly what we have in the workflow modules with the conditions like if workflowStatus and jobStatus...

f"Values for StepAccounting are wrong. Here are the given data: {data_dict}"
) from e

workflow_commons.dsc.addRegister(job_step)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it works?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't it?
When the command execution ends, the registers are extracted from the client and stored for later use.

Comment on lines +165 to +171
_request = PrivateAttr(default=None)
_failover_request = PrivateAttr(default=None)
_job_report = PrivateAttr(default=None)
_file_report = PrivateAttr(default_factory=FileReport)
_data_manager = PrivateAttr(default_factory=DataManager)
_bk_client = PrivateAttr(default_factory=BookkeepingClient)
_dsc = PrivateAttr(default_factory=DataStoreClient)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not carry the clients, I would only add the clients in the commands where needed. Why did you decide to have all them here?

Suggested change
_request = PrivateAttr(default=None)
_failover_request = PrivateAttr(default=None)
_job_report = PrivateAttr(default=None)
_file_report = PrivateAttr(default_factory=FileReport)
_data_manager = PrivateAttr(default_factory=DataManager)
_bk_client = PrivateAttr(default_factory=BookkeepingClient)
_dsc = PrivateAttr(default_factory=DataStoreClient)

The jobReport could potentially be part of the signature of pre/post process commands though, because it's also used in the JobWrapper (we could imagine having it reporting the "application status" being the name of the pre/post process command - that would be in the base classes - and passing it to the commands so that they can influence the status of the job if needed).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to maintain the same structure and keep the code on each command as clean as possible.
However, I understand that instantiating all clients when most are not needed is unefficient.

The jobReport could potentially be part of the signature of pre/post process commands though, because it's also used in the JobWrapper

This could be a nice addition. Also, Ryan's job_wrapper creates a DataManager. We could give the possibility of setting the clients in CommandBase and if they are not present while executing the command, just instantatiate them.

Failed = "Failed"


class Step(BaseModel):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There might be a way of having one step per CWL workflow step.
Check the workflow examples we have here (this is not MCSim but it's very similar): https://gitlab.cern.ch/lhcb-dpa/analysis-productions/lbapcommon/-/blob/master/tests/example_workflows/complex_workflow_with_filtering/AnaProd_example_workflows_job_tuple%2Csplit%2Cfilter_B0.cwl?ref_type=heads

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will look into this

Comment thread test/test_commands.py
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you split that file into multiple smaller files please? One per command

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Integrate real LHCb workflow commands (Pre/PostExecution) LHCb Workflow: UploadLogFile command

2 participants