Skip to content

Fix issue 292: amending for mass spectrometry specific objects, properties and controlled vocabularies#302

Open
janlisec wants to merge 5 commits into
BAMresearch:mainfrom
caro-demi:jan-fix-issue292
Open

Fix issue 292: amending for mass spectrometry specific objects, properties and controlled vocabularies#302
janlisec wants to merge 5 commits into
BAMresearch:mainfrom
caro-demi:jan-fix-issue292

Conversation

@janlisec

Copy link
Copy Markdown

existing object types amended by new properties:

  • INSTRUMENT.MASS.SPEC
  • EXPERIMENTAL_STEP.MS_BATCH
    new object types:
  • INSTRUMENT.LC_SYSTEM
  • INSTRUMENT.GC_SYSTEM
    several new contr.vocab
    !! I removed (outcommented) one existing property (MS_IONIZATION_MODE) to avoid redundancy with a novel property MS_ION_POLARITY -- please check if this causes inconsistencies. If so I would modify the PR and use the old property.

@janlisec janlisec requested a review from JosePizarro3 as a code owner May 28, 2026 16:29
@JosePizarro3 JosePizarro3 added the datamodel Changes in the datamodel label May 28, 2026

@JosePizarro3 JosePizarro3 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for the work, I have some minor comments + 2 larger ones:

  1. Maybe it is a good idea to clarify the role of CHROMOGRAPHY_TYPE and the two objects (LC_SYSTEM, GC_SYSTEM) you want to define. It seems this could be better represented by a property of type OBJECT and a common object type called CHROMATOGRAPHY.
  2. Would it be a good idea to move MS-specific definitions to its own subfolder, .../datamodel/mass_spectrometry? Similar to what we have in welding/.

section="MS Information",
)
# changes
# I would remove this property, as it is redundant with MS_ION_POLARITY. However should it be in use already, I would keep this one and not establish MS_ION_POLARITY.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will check this

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

code="MS_HYPHENATION_METHOD",
data_type="VARCHAR",
property_label="Hyphenation method",
description="""Hyphenation (DI, LC, GC, CE)//Probeninjektion (DI, LC, GC, CE)""",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these the only options or can there be more?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can consider removing this property or (better) subsituting against the new property CHROMATOGRAPHY_TYPE. MS_HYPHENATION_METHOD is basically redundant with CHROMATOGRAPHY_TYPE where the latter contains contr.vocab.

ms_intensity_range = PropertyTypeAssignment(
code="MS_INTENSITY_RANGE",
data_type="VARCHAR",
property_label="Intensity range",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is a range, I would like to see something else than a VARCHAR, perhaps a min/max couple of REAL properties (and the units)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could avoid this formal strictness here, but I can make the changes if you like.

code="MS_RESOLUTION",
data_type="REAL",
property_label="Resolution",
description="""Approximate Resolution//Ungefähre Auflösung""",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind elaborating? is this some sort of pixel or m/z?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolution {numeric, no unit} = m2 / (m2-m1) = value indicating if the peaks of two m/z measured at the same time can be clearly differentiated. We measure ion intensities (y) on a continuous scale for m/z (x). While mass precision is a parameter for the peak apex (is it at 123.4 or at 123.45), resolution is a parameter for peak width. Best example: consider compound CN. m would be 12.000 + 14.003 = 26.003. It has two isotopes, [13]CN = 13.007 + 14.003 = 27.010 and C[15]N = 12.000 + 15.000 = 27.000. We could measure both isotopic masses with sufficient precision independently but without sufficient resolution we would measure a single peak instead of two. 'approximate' is in there, because resolution is dependent on device type and mass, i.e. TOF-MS devices would have lower resolution for lower masses, Orbitrap devices would have higher resolution for lower masses.

section="MS Parameters",
)

ms_scan_rate = PropertyTypeAssignment(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I am a bit confused by this and the ms_resolution

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MS_SCAN_RATE is the time resolution of the device, either specified Hz [1/s] or sometimes in milli seconds per scan [ms].

)


class GcSystem(Instrument):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, for now, LcSystem and GcSystem contain the same properties, so I'd argue that both can be simply defined as a single entity and then add something like kind = ["LC", "GC"] to differentiate between them.

It is even related with CHROMATOGRAPHY_TYPE in the MassSpec object.

Or are you thinking of other potential properties that might be defined later on?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had long discussions with Caro on how to represent our intruments best. I would keep both systems apart because more system-type specific properties might be added in the future. Actually, we run these machines in combinations GC-MS or LC-MS, combining a separation device with one or several detector devices. I had originally specified INSTRUMENT.ANALYTICAL_SYSTEM to reflect this, but Caro suggested otherwise, stating that this was too broad.

from bam_masterdata.metadata.definitions import VocabularyTerm, VocabularyTypeDef
from bam_masterdata.metadata.entities import VocabularyType

# changes

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can delete these

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

section="Technical Details",
)

chromatography = PropertyTypeAssignment(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you defined LcSystem and GcSystem I wondered whether this is necessary or, better, whether what you need is a property of type OBJECT

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

description="""Ion Chromatography//Ionenchromatographie""",
)

none = VocabularyTerm(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need of NONE, this can be assigned when the property is not mandatory and nothing is stored

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep it. It is good to be expicit here and allow to state that no separation method was used before injection to the MS.

description="""Ion polarity//Ionenpolarität""",
)

pos = VocabularyTerm(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more verbose please, POSITIVE, NEGATIVE, POSITIVE_NEGATIVE

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok :)

@janlisec

janlisec commented Jun 3, 2026

Copy link
Copy Markdown
Author

Hi, thanks for the work, I have some minor comments + 2 larger ones:

  1. Maybe it is a good idea to clarify the role of CHROMOGRAPHY_TYPE and the two objects (LC_SYSTEM, GC_SYSTEM) you want to define. It seems this could be better represented by a property of type OBJECT and a common object type called CHROMATOGRAPHY.
  2. Would it be a good idea to move MS-specific definitions to its own subfolder, .../datamodel/mass_spectrometry? Similar to what we have in welding/.

[1] I would like to upload the device objects of our group so that users can link there measurements to these devices and other groups can see what we have got. We had originally a very broad INSTRUMENT.ANALYTICAL_SYSTEM, got more specific now and also could set up INSTURMENT.CHROMATOGRAPHY as an intermediate solution (but would potentially blow this up with type specific properties in the long run). All solutions have advantages and disadvantages... :/

[2] fine with me, especially when we can expect more specific property sets to appear in the future

--> once you made decisions on the above issues, I would modify the PR accordingly, correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

datamodel Changes in the datamodel

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants