feat: LHCb modules migration#97
Conversation
|
There is an issue with the FailoverRequest. |
c3f5c70 to
93cb40c
Compare
|
This PR will now tackle the migration of every single module (starting from the ones used in MCSim) by importing LHCb specific functions instead of rewritting them from the ground up |
First approach, needs review
Improve UploadLogFile tests
ca676da to
98ccc37
Compare
6511abe to
1da58a2
Compare
chore: Create proper wf_commons file update
c3b1414 to
947bc8b
Compare
2e7ae71 to
ed1c6f0
Compare
| logger.info("Preparing DISET request for %s", bk_file) | ||
|
|
||
| logger.info("Creating DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK in order to disable the Watchdog") | ||
| with open("DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f: |
There was a problem hiding this comment.
When you create new files like this, please make sure they are created in job_path.
In the PushJobAgent, we might have multiple job wrappers running in parallel in the future, you want to make sure they will not collide.
| with open("DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f: | |
| with open(job_path / "DISABLE_WATCHDOG_CPU_WALLCLOCK_CHECK", "w") as f: |
| class UploadOutputData(PostProcessCommand): | ||
| """Registers every output generated to the corresponding SE and Catalog or to the FailoverSE in case of failure.""" | ||
|
|
||
| def _execute(self, job_path: os.PathLike, workflow_commons: WorkflowCommons, **kwargs) -> None: |
There was a problem hiding this comment.
Out of curiosity, why os.PathLike here and not pathlib.Path?
There was a problem hiding this comment.
As far as I'm aware, the recommended way of typing path parameters is by using os.PathLike, which reminds me that that should be Union[str, os.PathLike], as str is not a subtype of os.PathLike.
On the other hand, I mostly use pathlib.Path for file operations, so maybe is not a bad idea to just use pathlib.
There was a problem hiding this comment.
Ok I didn't know. From what I read, os.PathLike[str] is abstract and useful in your specific context for typing, but then you can (should?) "cast" it as a pathlib.Path to manipulate it.
Now given that we completely control this area of the code, I naively assume that having pathlib.Path is safe.
But I would happily follow your recommendation 🙂
|
|
||
| # Write to file | ||
| bfilename = f"bookkeeping_{step_commons.id}.xml" | ||
| with open(bfilename, "wb") as bfile: |
There was a problem hiding this comment.
For instance, here you also want this to be part of job_path, let's write it explicitly may be.
| with open(bfilename, "wb") as bfile: | |
| with open(job_path / bfilename, "wb") as bfile: |
| except SErrorException as e: | ||
| logger.error("Failed to list the log directory\n%s", e) | ||
|
|
||
| if file_list: |
There was a problem hiding this comment.
What happens if an exception is raised? I assume file_list would not exist and you would get another exception?
Can you test that please?
There was a problem hiding this comment.
That's true, I missed that. I will check every newly added returnValueOrRaise, as it changes the code blocks.
| value = returnValueOrRaise(workflow_commons.file_report.generateForwardDISET()) | ||
| if not value: | ||
| logger.info("On second attempt, files correctly reported to TransformationDB") | ||
| elif workflow_commons.step_status == StepStatus.Done: |
There was a problem hiding this comment.
Also based on my previous comment in the WorkflowAccounting: it would be interesting to see whether there is a way to get the status of the cwl execution vs a step execution.
Because if this is possible, then may be there is a way to reproduce exactly what we have in the workflow modules with the conditions like if workflowStatus and jobStatus...
| f"Values for StepAccounting are wrong. Here are the given data: {data_dict}" | ||
| ) from e | ||
|
|
||
| workflow_commons.dsc.addRegister(job_step) |
There was a problem hiding this comment.
Why wouldn't it?
When the command execution ends, the registers are extracted from the client and stored for later use.
| _request = PrivateAttr(default=None) | ||
| _failover_request = PrivateAttr(default=None) | ||
| _job_report = PrivateAttr(default=None) | ||
| _file_report = PrivateAttr(default_factory=FileReport) | ||
| _data_manager = PrivateAttr(default_factory=DataManager) | ||
| _bk_client = PrivateAttr(default_factory=BookkeepingClient) | ||
| _dsc = PrivateAttr(default_factory=DataStoreClient) |
There was a problem hiding this comment.
I would not carry the clients, I would only add the clients in the commands where needed. Why did you decide to have all them here?
| _request = PrivateAttr(default=None) | |
| _failover_request = PrivateAttr(default=None) | |
| _job_report = PrivateAttr(default=None) | |
| _file_report = PrivateAttr(default_factory=FileReport) | |
| _data_manager = PrivateAttr(default_factory=DataManager) | |
| _bk_client = PrivateAttr(default_factory=BookkeepingClient) | |
| _dsc = PrivateAttr(default_factory=DataStoreClient) |
The jobReport could potentially be part of the signature of pre/post process commands though, because it's also used in the JobWrapper (we could imagine having it reporting the "application status" being the name of the pre/post process command - that would be in the base classes - and passing it to the commands so that they can influence the status of the job if needed).
There was a problem hiding this comment.
My idea was to maintain the same structure and keep the code on each command as clean as possible.
However, I understand that instantiating all clients when most are not needed is unefficient.
The jobReport could potentially be part of the signature of pre/post process commands though, because it's also used in the JobWrapper
This could be a nice addition. Also, Ryan's job_wrapper creates a DataManager. We could give the possibility of setting the clients in CommandBase and if they are not present while executing the command, just instantatiate them.
| Failed = "Failed" | ||
|
|
||
|
|
||
| class Step(BaseModel): |
There was a problem hiding this comment.
There might be a way of having one step per CWL workflow step.
Check the workflow examples we have here (this is not MCSim but it's very similar): https://gitlab.cern.ch/lhcb-dpa/analysis-productions/lbapcommon/-/blob/master/tests/example_workflows/complex_workflow_with_filtering/AnaProd_example_workflows_job_tuple%2Csplit%2Cfilter_B0.cwl?ref_type=heads
There was a problem hiding this comment.
Okay, I will look into this
There was a problem hiding this comment.
Could you split that file into multiple smaller files please? One per command
See #87
First approach, needs review.
I tried to use the current Mock classes (MockDataManager), but I don't think this is the correct way. However, it should not vary much.
I open it as a draft because I still need to create 2 more tests and change the interactions with DIRAC.