CSMD: the Core Scientific Metadata Model

IRI:
http://purl.org/net/CSMD/
Date:
28/03/2013
Current version:
4.0
Previous version:
Authors:
Brian Matthews, STFC
Contributors:
Steve Fisher, STFC

Abstract

CSMD, the Core Scientific MetaData model, is a model of metadata that captures a description of scientific "activities" (e.g. experiments, observations, simulations), which are characterised by an event where the application of resources (e.g. equipment, instruments) to a subject item (e.g. a chemical, geological or materials sample, or a biological specimen) causes a signal which is detected via sensors and results in the collection of data.

The CSMD is currently used largely within the context of large-scale scientific facilities (e.g. Photon and Neutron Sources) and has some characteristics which are specialised to those contexts. However, the model is intended to have general application, and thus these features are kept to a minimum.

A metadata model for Facilities Science

The Core Scientific Metadata Model (CSMD) is a metadata model oriented towards facilities science which has been developed at STFC over the last 12 years; for earlier work see [Sufi and Matthews 2004] [Sufi and Matthews 2005] [Matthews, B., et al. 2009]. The CSMD is used as the core metadata model within the data management infrastructure being developed for the large-scale scientific facilities supported by STFC, including the ISIS Neutron Source and the Diamond Light Source, and is now also used across the PaN-Data consortium and elsewhere. It is the result of an analysis of science practice over a number of years and a number of projects, with the aim of allowing users to manage their own data and to access other data of interest.

The current version is CSMD v.4.0, and the ICAT data catalogue is based on this version, though the database schema of ICAT has a number of modifications and additions to accommodate the practical implementation of the database.

The model is intended to capture high level information about investigations undertaken at facilities and the data that they produce. However, it is designed to be generic across scientific disciplines and has application beyond facilities science, particularly in the "structural sciences" (such as chemistry, material science, earth science, and biochemistry) which are concerned with the molecular structure of substances, and within which systematic experimental analyses are undertaken on material samples.

The core metadata model

CSMD is organised around the notion of Studies, a study being a body of scientific work on a particular subject of interest. During a study, a scientist would perform a number of activities, e.g. experiments, observations, measurements and simulations. Results from these activities usually run through different stages: the collection of raw data, the generation of analysed or derived data through the application of software tools, and end results. Data should be grouped accordingly, and associated with the appropriate experimental parameters. Not all information captured in specific metadata schemas would be used to search for this data or distinguish one data set from another, giving the possibility to select special parameters. The CSMD is designed to be a common general format/standard for Scientific Studies and their associated data holdings.

Thus this model:

The CSMD has been developed to be a core system which is extensible and can be specialised to particular scientific domains, so it does not make assumptions about the specific terminology of the domain.

The use of the CSMD for facilities is focused on capturing the activities associated with the facilities science lifecycle as detailed in [PaNData-ODI D6.1], which generically is called an Investigation. Thus the Investigation concept represents those activities and research outputs associated with one approved application for use of the facilities. In practice, this may involve a number of visits to use a number of instruments, analysing a number of samples and generating a number of resultant data sets.

The model thus defines a hierarchical model of the structure of scientific research around studies and investigations, with their associated information, and also a generic model of the organisation of data sets into collections and files. Specific data sets can be associated with the appropriate experimental parameters, and details of the files holding the actual data, including their location for linking. This provides a detailed description of the study, although not all information captured in specific metadata schemas would be used to search for this data or distinguish one data set from another.

Figure 1: Main entities of the CSMD

The metadata within the general structure is laid out in a series of classes and subclasses. We do not describe the whole model in detail for reasons of space, but rather select some areas of particular interest. The core entities of the CSMD for a study are given in Figure 1, and are summarised as follows.

Study
The fundamental unit of the CSMD Model, collecting together an aggregation of associated scientific activities, data collecting events, artefacts (e.g. instruments, samples, datasets) and people.
Investigation
The fundamental unit of a facility study, associated with an accepted proposal for use of the facility to undertake a series of experiments. Its attributes include a title, abstract, dates, and unique identifiers referencing the particular investigation.
Facility
The facility which is used within the investigation, hosting the instrument and experiments.
Instrument
The instruments the investigation uses to carry out experiments.
Sample
Information on the material sample analysed within the investigation. The model has attributes for a sample's name, chemical formula and any associated special information, such as specific safety information on a toxic material.
Dataset
One or more datasets can be associated with an investigation, representing different runs or analyses on a sample. Initially a raw data set can be attached to the investigation, but subsequently, analysed datasets can also be added.
Datafile
The CSMD takes a hierarchical view of data holdings, as data sets may contain other datasets as well as units of storage, typically datafiles. Each datafile has more detailed attributes, including its name, version, location, data format, creation and modification time, and fixity information such as a checksum.
Parameter
Parameters describe measurable quantities associated with the investigation, such as temperature, pressure, or scattering angle, describing either the parameters of the sample, the environment the data was collected in, or the parameters being measured. Parameters can be associated at different levels (the investigation, the sample, the dataset or the datafile) and have attributes for names, units, values, and allowable data ranges. These different types of parameter are all represented as subclasses of the general Parameter class. Parameters in CSMD are thus a very general data-holding component (essentially a key-value pair table), and provide a highly flexible and tailorable mechanism for representing information about many types of study.
InvestigationUser
A user (represented as a separate entity, omitted for clarity) associated with a role in the investigation.
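
The Parameter entity described above can be pictured as a typed key-value pair. The following Python sketch is illustrative only: the field names follow the parameter and parametertype attributes listed in the property reference of this document, but the class layout and the validation rule (limits are checked only when the type is marked as enforced) are assumptions, not part of the CSMD definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParameterType:
    """Mirrors a few csmd:ParameterType attributes (name, units, limits)."""
    name: str
    units: str
    value_type: str = "NUMERIC"  # NUMERIC, STRING or DATE_AND_TIME
    minimum_numeric_value: Optional[float] = None
    maximum_numeric_value: Optional[float] = None
    enforced: bool = False  # assumed meaning: limits must be respected


@dataclass
class Parameter:
    """One value attached to an investigation, sample, dataset or datafile."""
    type: ParameterType
    numeric_value: Optional[float] = None
    string_value: Optional[str] = None

    def is_valid(self) -> bool:
        """Check a numeric value against its type's limits when enforced."""
        t = self.type
        if not t.enforced or t.value_type != "NUMERIC":
            return True
        if self.numeric_value is None:
            return False
        low, high = t.minimum_numeric_value, t.maximum_numeric_value
        if low is not None and self.numeric_value < low:
            return False
        if high is not None and self.numeric_value > high:
            return False
        return True


# Hypothetical example: a sample temperature in kelvin.
temperature = ParameterType("temperature", "K",
                            minimum_numeric_value=0.0, enforced=True)
reading = Parameter(temperature, numeric_value=4.2)
print(reading.is_valid())
```

Keeping the units and limits on the type rather than on each value reflects the split between Parameter and ParameterType in the model.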

Additionally, a number of other entities are defined to capture, for example, associated publications, the data formats used, sample types (representing the class of material under analysis, of which the sample is a particular instance), and parameter types (representing the classes of parameters of which the parameters are particular instances).

Further entities capture specialist facility concepts, such as "shifts" (representing a daily time period within which the experiment was undertaken) and "cycles" (representing a period of weeks or months where the facility is in active operation between "shutdown" periods of maintenance).

A full UML class diagram of the model is given in Figure 2.

Figure 2: UML Class Diagram for the full CSMD

OWL representation

We support this metadata model by providing a representation as an OWL ontology. This allows us to represent metadata as RDF triples within triple stores (or to provide a triple-based front end onto metadata databases such as ICAT via, for example, a SPARQL endpoint). We can also publish data about experiments as Linked Open Data, furthering the publication, exchange and sharing of metadata about facilities experiments, as well as its combination with other metadata items. For brevity, we give only a sample of the OWL representation here. The full model can be found on the ICAT Google Code site.

The OWL representation has a base URI: http://www.purl.org/net/CSMD/4.0#

The OWL representation reflects the UML model closely. Thus for each UML class there is a corresponding OWL Class, as below (using RDF/XML notation):

   
    <!-- csmd:Investigation -->

<owl:Class rdf:about="csmd:Investigation">
<rdfs:label>Investigation</rdfs:label>
<rdfs:comment>An investigation or experiment</rdfs:comment>
</owl:Class>

<!-- csmd:Facility -->

<owl:Class rdf:about="csmd:Facility">
<rdfs:label>Facility</rdfs:label>
<rdfs:comment>An experimental facility</rdfs:comment>
</owl:Class>

<!-- csmd:Dataset -->

<owl:Class rdf:about="csmd:Dataset">
<rdfs:label>Dataset</rdfs:label>
<rdfs:comment>A collection of data files and part of an investigation</rdfs:comment>
</owl:Class>

<!-- csmd:Datafile -->

<owl:Class rdf:about="csmd:Datafile">
<rdfs:label>Datafile</rdfs:label>
<rdfs:comment>A data file</rdfs:comment>
</owl:Class>

For each Class attribute in the UML model, there is a corresponding OWL DatatypeProperty, conventionally named by prefixing the attribute name with the lowercased domain Class name, as in the following:

    <!-- csmd:investigation_startDate -->

<owl:DatatypeProperty rdf:about="csmd:investigation_startDate">
<rdf:type rdf:resource="&owl;FunctionalProperty"/>
<rdfs:label>investigation_startDate</rdfs:label>
<rdfs:comment>The time at which the investigation was initiated</rdfs:comment>
<rdfs:domain rdf:resource="csmd:Investigation"/>
<rdfs:range rdf:resource="&xsd;dateTime"/>
</owl:DatatypeProperty>

<!-- csmd:investigation_summary -->

<owl:DatatypeProperty rdf:about="csmd:investigation_summary">
<rdfs:label>investigation_summary</rdfs:label>
<rdfs:comment>Summary or abstract</rdfs:comment>
<rdfs:domain rdf:resource="csmd:Investigation"/>
<rdfs:range rdf:resource="&xsd;string"/>
</owl:DatatypeProperty>

<!-- csmd:investigation_title -->

<owl:DatatypeProperty rdf:about="csmd:investigation_title">
<rdfs:label>investigation_title</rdfs:label>
<rdfs:comment>Full title of the investigation</rdfs:comment>
<rdfs:domain rdf:resource="csmd:Investigation"/>
<rdfs:range rdf:resource="&xsd;string"/>
</owl:DatatypeProperty>
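
Judging by the property ids used in this document (for example csmd:investigation_startDate and csmd:datafileformat_name), a DatatypeProperty id is the lowercased class name, an underscore, and then the attribute name. A minimal Python sketch of that convention follows; the helper name is hypothetical and the rule is inferred from the examples rather than stated normatively.

```python
CSMD_PREFIX = "csmd:"


def datatype_property_id(class_name: str, attribute_name: str) -> str:
    """Build a property id as <lowercased class name>_<attribute name>."""
    return f"{CSMD_PREFIX}{class_name.lower()}_{attribute_name}"


# The attribute name keeps its own capitalisation; only the class is lowercased.
for cls, attr in [("Investigation", "startDate"), ("DatafileFormat", "name")]:
    print(datatype_property_id(cls, attr))
```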

Similarly, for each association in the UML model, there are two corresponding OWL ObjectProperties which form a pair of inverse properties, each conventionally named by prefixing the range Class name with the domain Class name. Two properties are needed because, whilst associations in UML are not directed, properties in OWL are. This results, for example, in the following:

    <!-- csmd:facility_instrument -->

<owl:ObjectProperty rdf:about="csmd:facility_instrument">
<rdfs:label>facility_instrument</rdfs:label>
<rdfs:comment>An Instrument supported by a facility.</rdfs:comment>
<rdfs:domain rdf:resource="csmd:Facility"/>
<rdfs:range rdf:resource="csmd:Instrument"/>
<owl:inverseOf rdf:resource="csmd:instrument_facility"/>
</owl:ObjectProperty>

<!-- csmd:instrument_facility -->

<owl:ObjectProperty rdf:about="csmd:instrument_facility">
<rdf:type rdf:resource="&owl;FunctionalProperty"/>
<rdfs:label>instrument_facility</rdfs:label>
<rdfs:comment>The facility which has this instrument.</rdfs:comment>
<rdfs:range rdf:resource="csmd:Facility"/>
<rdfs:domain rdf:resource="csmd:Instrument"/>
</owl:ObjectProperty>

<!-- csmd:investigation_dataset -->

<owl:ObjectProperty rdf:about="csmd:investigation_dataset">
<rdf:type rdf:resource="&owl;InverseFunctionalProperty"/>
<rdfs:label>investigation_dataset</rdfs:label>
<rdfs:comment>A data set which is the result of an investigation</rdfs:comment>
<rdfs:range rdf:resource="csmd:Dataset"/>
<rdfs:domain rdf:resource="csmd:Investigation"/>
</owl:ObjectProperty>

<!-- csmd:dataset_investigation -->

<owl:ObjectProperty rdf:about="csmd:dataset_investigation">
<rdf:type rdf:resource="&owl;FunctionalProperty"/>
<rdfs:label>dataset_investigation</rdfs:label>
<rdfs:comment>The investigation associated with a dataset</rdfs:comment>
<rdfs:domain rdf:resource="csmd:Dataset"/>
<rdfs:range rdf:resource="csmd:Investigation"/>
<owl:inverseOf rdf:resource="csmd:investigation_dataset"/>
</owl:ObjectProperty>

Note also that some object properties are declared as OWL Functional (or InverseFunctional) properties; this carries the cardinality constraints of the UML model over into the OWL model.
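
The effect of the inverse and functional declarations can be illustrated with a small Python sketch over (subject, property, object) triples. The property names are taken from the examples above; the instance names (ex:inv1, ex:ds1 and so on) are hypothetical. Note one deliberate simplification: the check treats a functional property with two values as a violation, which is a closed-world reading. An OWL reasoner would instead infer that the two values denote the same individual unless they are declared different.

```python
from collections import defaultdict

# Declarations taken from the RDF/XML examples above.
INVERSE_OF = {
    "csmd:dataset_investigation": "csmd:investigation_dataset",
    "csmd:facility_instrument": "csmd:instrument_facility",
}
FUNCTIONAL = {"csmd:dataset_investigation", "csmd:instrument_facility"}


def with_inverses(triples):
    """Add the triple implied by each owl:inverseOf declaration."""
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for s, p, o in triples:
        if p in inverse:
            result.add((o, inverse[p], s))
    return result


def functional_violations(triples):
    """Return (subject, property) pairs that have more than one value."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in FUNCTIONAL:
            values[(s, p)].add(o)
    return sorted(key for key, objs in values.items() if len(objs) > 1)


# Hypothetical data: one dataset attached to two investigations.
triples = [
    ("ex:inv1", "csmd:investigation_dataset", "ex:ds1"),
    ("ex:inv2", "csmd:investigation_dataset", "ex:ds1"),
]
closed = with_inverses(triples)
print(functional_violations(closed))
```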

Supporting Provenance

In order to support provenance, we need to relate the model of facility experiments encapsulated in the "data-centric" view expressed in the CSMD with a notion of scientific activity, so that we can provide a notion of the further processes and outputs involved with managing the data. We relate the model with the general notion of a process given in the W3C Prov model [PROV-O], which gives a high level view of a process step as in Figure 3.

Figure 3: High level view of the structure of Prov records

Thus the Prov model defines three general classes: Entity, Activity and Agent.

Then the general properties relate these class instances together to represent the relationships between entities and how they were processed by whom to form a record of provenance. Note that, in common with many approaches to representing provenance, these relationships take the point of view of an historical record of the origins of entities, asking the question "where does this entity come from", so the arrows point from the future to the past.

In the CSMD model, Entities include datasets, datafiles, samples, and indeed investigations themselves (as a conceptual object representing the entirety of the experiment), while the InvestigationUser and instruments are Agents. We can add these relationships into the OWL model by making the appropriate CSMD OWL classes subclasses of the PROV-O classes.
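
As a rough illustration of this subclassing, the following Python sketch writes out one rdfs:subClassOf statement per class in N-Triples syntax. The CSMD base URI is the one given earlier in this document and the PROV namespace is the standard W3C one; the mapping table simply restates the paragraph above, and whether the published ontology contains exactly these axioms is an assumption.

```python
CSMD = "http://www.purl.org/net/CSMD/4.0#"
PROV = "http://www.w3.org/ns/prov#"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# CSMD class -> PROV-O superclass, as described in the text.
PROV_SUPERCLASS = {
    "Dataset": "Entity",
    "Datafile": "Entity",
    "Sample": "Entity",
    "Investigation": "Entity",
    "InvestigationUser": "Agent",
    "Instrument": "Agent",
}


def subclass_axioms():
    """Yield one N-Triples line per CSMD class, in alphabetical order."""
    for csmd_class, prov_class in sorted(PROV_SUPERCLASS.items()):
        yield f"<{CSMD}{csmd_class}> <{RDFS_SUBCLASS_OF}> <{PROV}{prov_class}> ."


print("\n".join(subclass_axioms()))
```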

The provenance step for which we have the most immediate need is the use of an analytic software package on datasets or specific data files, to generate new datasets and datafiles. Consequently, a new class Job has been added to CSMD to represent the Activity (in the PROV sense) of running a software package on a set of input data to derive a set of output data. We represent this in UML as in Figure 4.

Figure 4: Modelling jobs in CSMD

Thus a Job is an Activity which is associated with a software application as an Agent (again in the PROV-O sense), takes a number of data sets, and data files within those data sets, as inputs, and produces data files and data sets as outputs. These data sets are themselves linked into the Investigation. Thus inputDataSet and inputDataFile are associations which are specialisations of the PROV-O used property, and outputDataSet and outputDataFile are associations which are specialisations of the PROV-O wasGeneratedBy property (with appropriate directionality).
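
The Job structure described above, and the "future to past" direction of provenance links, can be sketched in Python as follows. This is an illustration under stated assumptions: the class layout, the job and dataset names, and the helper functions are hypothetical; only the PROV-O property names (used, wasGeneratedBy, wasAssociatedWith) come from the PROV-O vocabulary. Data files are omitted for brevity, and the provenance graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """A run of an application with its input and output datasets."""
    name: str
    application: str
    input_datasets: list = field(default_factory=list)
    output_datasets: list = field(default_factory=list)


def prov_statements(job):
    """Express one Job as PROV-style (subject, property, object) statements."""
    statements = [(job.name, "prov:wasAssociatedWith", job.application)]
    statements += [(job.name, "prov:used", d) for d in job.input_datasets]
    statements += [(d, "prov:wasGeneratedBy", job.name)
                   for d in job.output_datasets]
    return statements


def lineage(dataset, jobs):
    """Walk backwards from a dataset to every dataset it was derived from."""
    ancestors = []
    for job in jobs:
        if dataset in job.output_datasets:
            for source in job.input_datasets:
                if source not in ancestors:
                    ancestors.append(source)
                for older in lineage(source, jobs):
                    if older not in ancestors:
                        ancestors.append(older)
    return ancestors


# Hypothetical two-step analysis chain: raw -> reduced -> fitted.
jobs = [
    Job("reduce", "mantid", ["raw"], ["reduced"]),
    Job("fit", "mantid", ["reduced"], ["fitted"]),
]
print(prov_statements(jobs[0]))
print(lineage("fitted", jobs))
```

The output statements point from the generated dataset back to the job, matching the convention that provenance records answer "where does this entity come from".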

Generalising the Activity Model

We can generalize this notion of Scientific Activity in the CSMD model to add a general class of activities, and add other types of activity. Thus we can add a "Run" activity to the CSMD representing the particular activity of generating a dataset from a sample using an instrument (an agent), as in Figure 5.

Figure 5: A Run model for CSMD

We would also relate the Investigation to the Run. A Run entity is not included within CSMD 4.0 and is still under discussion, as it is not yet accepted that it adds to the current use of ICAT; however, it would make the model more consistent in its representation of provenance.

Description of classes and properties

We give a complete description of the classes and properties in the CSMD model.

Ontology Classes

Class Id csmd:Application
Class Name: Application
Description: Some piece of software

Class Id csmd:Datafile
Class Name: Datafile
Description: A data file

Class Id csmd:DatafileFormat
Class Name: DatafileFormat
Description: A data file format

Class Id csmd:DatafileParameter
Class Name: DatafileParameter
Description: A parameter associated with a data file

Class Id csmd:Dataset
Class Name: Dataset
Description: A collection of data files and part of an investigation

Class Id csmd:DatasetParameter
Class Name: DatasetParameter
Description: A parameter associated with a data set

Class Id csmd:DatasetType
Class Name: DatasetType
Description: A type of data set

Class Id csmd:Facility
Class Name: Facility
Description: An experimental facility

Class Id csmd:FacilityCycle
Class Name: FacilityCycle
Description: An operating cycle within a facility

Class Id csmd:Instrument
Class Name: Instrument
Description: Used by a user within an investigation

Class Id csmd:Investigation
Class Name: Investigation
Description: An investigation or experiment

Class Id csmd:InvestigationParameter
Class Name: InvestigationParameter
Description: A parameter associated with an investigation

Class Id csmd:InvestigationType
Class Name: InvestigationType
Description: A type of investigation

Class Id csmd:InvestigationUser
Class Name: InvestigationUser
Description: Many to many relationship between investigation and user

Class Id csmd:Job
Class Name: Job
Description: A run of an application with its related inputs and outputs

Class Id csmd:Keyword
Class Name: Keyword
Description: Must be related to an investigation

Class Id csmd:Parameter
Class Name: Parameter
Description: A parameter associated with an entity

Class Id csmd:ParameterType
Class Name: ParameterType
Description: A parameter type with unique name and units

Class Id csmd:PermissibleStringValue
Class Name: PermissibleStringValue
Description: Permissible value for string parameter types

Class Id csmd:Publication
Class Name: Publication
Description: A publication

Class Id csmd:RelatedDatafile
Class Name: RelatedDatafile
Description: Used to represent an arbitrary relationship between data files

Class Id csmd:Sample
Class Name: Sample
Description: A sample to be used in an investigation

Class Id csmd:SampleParameter
Class Name: SampleParameter
Description: A parameter associated with a sample

Class Id csmd:SampleType
Class Name: SampleType
Description: A type of sample

Class Id csmd:Shift
Class Name: Shift
Description: A period of time related to an investigation

Class Id csmd:Study
Class Name: Study
Description: A study which may be related to an investigation

Class Id csmd:User
Class Name: User
Description: A user of the facility

Data Properties

Property Id csmd:application_name
Property Name: application_name
Domain: csmd:Application
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name for the software - e.g. mantid

Property Id csmd:application_version
Property Name: application_version
Domain: csmd:Application
Range: http://www.w3.org/2001/XMLSchema#string
Description:

Property Id csmd:datafile_checksum
Property Name: datafile_checksum
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#string
Description: Checksum of file represented as a string

Property Id csmd:datafile_datafileCreateTime
Property Name: datafile_datafileCreateTime
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: Date of creation of the actual file rather than of the metadata

Property Id csmd:datafile_datafileModTime
Property Name: datafile_datafileModTime
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: Date of modification of the actual file rather than of the metadata

Property Id csmd:datafile_description
Property Name: datafile_description
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#string
Description: A full description of the file contents

Property Id csmd:datafile_doi
Property Name: datafile_doi
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#string
Description: The Digital Object Identifier associated with this data file

Property Id csmd:datafile_fileSize
Property Name: datafile_fileSize
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#long
Description: File size expressed in bytes

Property Id csmd:datafile_location
Property Name: datafile_location
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#string
Description: The logical location of the file - which may also be the physical location

Property Id csmd:datafile_name
Property Name: datafile_name
Domain: csmd:Datafile
Range: http://www.w3.org/2001/XMLSchema#string
Description: A name given to the file

Property Id csmd:datafileformat_description
Property Name: datafileformat_description
Domain: csmd:DatafileFormat
Range: http://www.w3.org/2001/XMLSchema#string
Description: An informal description of the format

Property Id csmd:datafileformat_name
Property Name: datafileformat_name
Domain: csmd:DatafileFormat
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying the format within the facility - e.g. "mp3"

Property Id csmd:datafileformat_type
Property Name: datafileformat_type
Domain: csmd:DatafileFormat
Range: http://www.w3.org/2001/XMLSchema#string
Description: Holds the underlying format - such as binary or text

Property Id csmd:datafileformat_version
Property Name: datafileformat_version
Domain: csmd:DatafileFormat
Range: http://www.w3.org/2001/XMLSchema#string
Description: The version if needed. The version code may be part of the basic name

Property Id csmd:dataset_complete
Property Name: dataset_complete
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: May be set to true when all data files and parameters have been added to the data set. The precise meaning is facility dependent.

Property Id csmd:dataset_description
Property Name: dataset_description
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#string
Description: An informal description of the data set

Property Id csmd:dataset_doi
Property Name: dataset_doi
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#string
Description: The Digital Object Identifier associated with this data set

Property Id csmd:dataset_endDate
Property Name: dataset_endDate
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: Time that the data set was last updated.

Property Id csmd:dataset_location
Property Name: dataset_location
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#string
Description: Identifies a location from which all the files of the data set might be accessed. It might be a directory

Property Id csmd:dataset_name
Property Name: dataset_name
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name for the data set

Property Id csmd:dataset_startDate
Property Name: dataset_startDate
Domain: csmd:Dataset
Range: http://www.w3.org/2001/XMLSchema#date
Description: The time that a dataset is created.

Property Id csmd:datasettype_description
Property Name: datasettype_description
Domain: csmd:DatasetType
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of this data set type

Property Id csmd:datasettype_name
Property Name: datasettype_name
Domain: csmd:DatasetType
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying this data set type within the facility

Property Id csmd:facility_daysUntilRelease
Property Name: facility_daysUntilRelease
Domain: csmd:Facility
Range: http://www.w3.org/2001/XMLSchema#integer
Description: The number of days before data is made freely available after collecting it.

Property Id csmd:facility_description
Property Name: facility_description
Domain: csmd:Facility
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of this facility

Property Id csmd:facility_fullName
Property Name: facility_fullName
Domain: csmd:Facility
Range: http://www.w3.org/2001/XMLSchema#string
Description: The full name of the facility

Property Id csmd:facility_name
Property Name: facility_name
Domain: csmd:Facility
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying this facility

Property Id csmd:facility_url
Property Name: facility_url
Domain: csmd:Facility
Range: http://www.w3.org/2001/XMLSchema#string
Description: A URL associated with this facility

Property Id csmd:facilitycycle_description
Property Name: facilitycycle_description
Domain: csmd:FacilityCycle
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of this facility cycle

Property Id csmd:facilitycycle_endDate
Property Name: facilitycycle_endDate
Domain: csmd:FacilityCycle
Range: http://www.w3.org/2001/XMLSchema#date
Description: End of cycle

Property Id csmd:facilitycycle_name
Property Name: facilitycycle_name
Domain: csmd:FacilityCycle
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying this facility cycle within the facility

Property Id csmd:facilitycycle_startDate
Property Name: facilitycycle_startDate
Domain: csmd:FacilityCycle
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: Start of cycle

Property Id csmd:instrument_description
Property Name: instrument_description
Domain: csmd:Instrument
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of this instrument

Property Id csmd:instrument_fullName
Property Name: instrument_fullName
Domain: csmd:Instrument
Range: http://www.w3.org/2001/XMLSchema#string
Description: The formal name of this instrument

Property Id csmd:instrument_name
Property Name: instrument_name
Domain: csmd:Instrument
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying this instrument within the facility

Property Id csmd:instrument_type
Property Name: instrument_type
Domain: csmd:Instrument
Range: http://www.w3.org/2001/XMLSchema#string
Description: The type of the instrument - e.g. spectrometer etc. Also should refer to the technique (?)

Property Id csmd:investigation_doi
Property Name: investigation_doi
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#string
Description: The Digital Object Identifier associated with this investigation

Property Id csmd:investigation_endDate
Property Name: investigation_endDate
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: The latest date of change to the investigation

Property Id csmd:investigation_name
Property Name: investigation_name
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name for the investigation

Property Id csmd:investigation_releaseDate
Property Name: investigation_releaseDate
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#date
Description: A date when the data will be made freely available after an embargo period

Property Id csmd:investigation_startDate
Property Name: investigation_startDate
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: The time at which the investigation was initiated

Property Id csmd:investigation_summary
Property Name: investigation_summary
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#string
Description: Summary or abstract

Property Id csmd:investigation_title
Property Name: investigation_title
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#string
Description: Full title of the investigation

Property Id csmd:investigation_visitId
Property Name: investigation_visitId
Domain: csmd:Investigation
Range: http://www.w3.org/2001/XMLSchema#string
Description: Identifier for the visit to which this investigation is related

Property Id csmd:investigationtype_description
Property Name: investigationtype_description
Domain: csmd:InvestigationType
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of this type of investigation

Property Id csmd:investigationtype_name
Property Name: investigationtype_name
Domain: csmd:InvestigationType
Range: http://www.w3.org/2001/XMLSchema#string
Description: A short name identifying this type of investigation

Property Id csmd:investigationuser_role
Property Name: investigationuser_role
Domain: csmd:InvestigationUser
Range: http://www.w3.org/2001/XMLSchema#string
Description:

Property Id csmd:keyword_name
Property Name: keyword_name
Domain: csmd:Keyword
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of the keyword

Property Id csmd:parameter_dateTimeValue
Property Name: parameter_dateTimeValue
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: The value if the parameter is a date

Property Id csmd:parameter_error
Property Name: parameter_error
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#double
Description: The error of the numeric parameter

Property Id csmd:parameter_numericValue
Property Name: parameter_numericValue
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#double
Description: The value if the parameter is numeric

Property Id csmd:parameter_rangeBottom
Property Name: parameter_rangeBottom
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#double
Description: The minimum value of the numeric parameter that was observed during the measurement period

Property Id csmd:parameter_rangeTop
Property Name: parameter_rangeTop
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#double
Description: The maximum value of the numeric parameter that was observed during the measurement period

Property Id csmd:parameter_stringValue
Property Name: parameter_stringValue
Domain: csmd:Parameter
Range: http://www.w3.org/2001/XMLSchema#string
Description: The value if the parameter is a string

Property Id csmd:parametertype_applicableToDatafile
Property Name: parametertype_applicableToDatafile
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: If a parameter of this type may be applied to a data file

Property Id csmd:parametertype_applicableToDataset
Property Name: parametertype_applicableToDataset
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: If a parameter of this type may be applied to a data set

Property Id csmd:parametertype_applicableToInvestigation
Property Name: parametertype_applicableToInvestigation
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: If a parameter of this type may be applied to an investigation

Property Id csmd:parametertype_applicableToSample
Property Name: parametertype_applicableToSample
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: If a parameter of this type may be applied to a sample

Property Id csmd:parametertype_description
Property Name: parametertype_description
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#string
Description: Description of the parameter type

Property Id csmd:parametertype_enforced
Property Name: parametertype_enforced
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: True if constraints are enforced

Property Id csmd:parametertype_maximumNumericValue
Property Name: parametertype_maximumNumericValue
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#double
Description:

Property Id csmd:parametertype_minimumNumericValue
Property Name: parametertype_minimumNumericValue
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#double
Description:

Property Id csmd:parametertype_name
Property Name: parametertype_name
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of the parameter type

Property Id csmd:parametertype_units
Property Name: parametertype_units
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of the parameter type units

Property Id csmd:parametertype_unitsFullName
Property Name: parametertype_unitsFullName
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#string
Description: The formal name of the parameter type units

Property Id csmd:parametertype_valueType
Property Name: parametertype_valueType
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#string
Description: Enumeration with possible values: NUMERIC, STRING, DATE_AND_TIME

Property Id csmd:parametertype_verified
Property Name: parametertype_verified
Domain: csmd:ParameterType
Range: http://www.w3.org/2001/XMLSchema#boolean
Description: If ordinary users are allowed to create their own parameter types, this indicates that this one has been approved
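
The ParameterType properties above suggest how a numeric parameter value might be checked against its type. The following is a minimal sketch only: the class and function names are hypothetical, and because the descriptions of parametertype_minimumNumericValue and parametertype_maximumNumericValue are left blank in this specification, the code assumes they are inclusive bounds that apply only when parametertype_enforced is true.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParameterType:
    """Hypothetical in-memory mirror of a few csmd:ParameterType properties."""
    name: str
    value_type: str  # "NUMERIC", "STRING" or "DATE_AND_TIME"
    enforced: bool = False
    minimum_numeric_value: Optional[float] = None
    maximum_numeric_value: Optional[float] = None


def check_numeric_value(ptype: ParameterType, value: float) -> bool:
    """Return True if value is acceptable for ptype under the assumed rule."""
    if ptype.value_type != "NUMERIC":
        return False
    if not ptype.enforced:
        # Assumption: bounds are advisory unless constraints are enforced.
        return True
    low = ptype.minimum_numeric_value
    high = ptype.maximum_numeric_value
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True
```

For example, with a type declared as ParameterType("temperature", "NUMERIC", enforced=True, minimum_numeric_value=0.0, maximum_numeric_value=500.0), the value 293.15 passes and 600.0 is rejected.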

Property Id csmd:permissiblestringvalue_value
Property Name: permissiblestringvalue_value
Domain: csmd:PermissibleStringValue
Range: http://www.w3.org/2001/XMLSchema#string
Description: The value of the string

Property Id csmd:publication_doi
Property Name: publication_doi
Domain: csmd:Publication
Range: http://www.w3.org/2001/XMLSchema#string
Description: The Digital Object Identifier associated with this publication

Property Id csmd:publication_fullReference
Property Name: publication_fullReference
Domain: csmd:Publication
Range: http://www.w3.org/2001/XMLSchema#string
Description: A reference in the form to be used for citation

Property Id csmd:publication_repository
Property Name: publication_repository
Domain: csmd:Publication
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of a repository where the publication is held

Property Id csmd:publication_repositoryId
Property Name: publication_repositoryId
Domain: csmd:Publication
Range: http://www.w3.org/2001/XMLSchema#string
Description: The id of the publication within the repository

Property Id csmd:publication_url
Property Name: publication_url
Domain: csmd:Publication
Range: http://www.w3.org/2001/XMLSchema#string
Description: A URL from which the publication may be downloaded

Property Id csmd:relateddatafile_relation
Property Name: relateddatafile_relation
Range: http://www.w3.org/2001/XMLSchema#string
Description: Identifies the type of relationship between the two datafiles - e.g. "COPY"

Property Id csmd:sample_name
Property Name: sample_name
Domain: csmd:Sample
Range: http://www.w3.org/2001/XMLSchema#string
Description:

Property Id csmd:sampletype_molecularFormula
Property Name: sampletype_molecularFormula
Domain: csmd:SampleType
Range: http://www.w3.org/2001/XMLSchema#string
Description: The formula written as a string, e.g. C2H6O2 for ethylene glycol
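
As an illustration of how a consumer might read a sampletype_molecularFormula string, the sketch below counts the atoms of each element. It assumes a flat formula made of element symbols with optional integer counts, as in the example given; parentheses, charges and hydrates are not handled, and the function name is hypothetical.

```python
import re
from typing import Dict

# An element symbol (one capital, optional lowercase letter) and an optional count.
_ELEMENT_COUNT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_molecular_formula(formula: str) -> Dict[str, int]:
    """Count atoms per element in a flat formula such as 'C2H6O2'."""
    counts: Dict[str, int] = {}
    for symbol, digits in _ELEMENT_COUNT.findall(formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts
```

Under these assumptions, parse_molecular_formula("C2H6O2") returns {"C": 2, "H": 6, "O": 2}.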

Property Id csmd:sampletype_name
Property Name: sampletype_name
Domain: csmd:SampleType
Range: http://www.w3.org/2001/XMLSchema#string
Description: Name of a type of sample

Property Id csmd:sampletype_safetyInformation
Property Name: sampletype_safetyInformation
Domain: csmd:SampleType
Range: http://www.w3.org/2001/XMLSchema#string
Description: Any safety information related to this sample

Property Id csmd:shift_comment
Property Name: shift_comment
Domain: csmd:Shift
Range: http://www.w3.org/2001/XMLSchema#string
Description:

Property Id csmd:shift_endDate
Property Name: shift_endDate
Domain: csmd:Shift
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description:

Property Id csmd:shift_startDate
Property Name: shift_startDate
Domain: csmd:Shift
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description:

Property Id csmd:study_description
Property Name: study_description
Domain: csmd:Study
Range: http://www.w3.org/2001/XMLSchema#string
Description: A description of the study and its purpose

Property Id csmd:study_endDate
Property Name: study_endDate
Domain: csmd:Study
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: The end date of this study

Property Id csmd:study_name
Property Name: study_name
Domain: csmd:Study
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of the study

Property Id csmd:study_startDate
Property Name: study_startDate
Domain: csmd:Study
Range: http://www.w3.org/2001/XMLSchema#dateTime
Description: The start date of this study

Property Id csmd:study_status
Property Name: study_status
Domain: csmd:Study
Range: http://www.w3.org/2001/XMLSchema#string
Description: The status of the study. Possible values are: NEW, IN_PROGRESS, COMPLETE, CANCELLED
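
Because study_status has a string range but a fixed set of possible values, an application may wish to check values before storing them. A minimal sketch follows; the enum and function names are hypothetical, and only the four values listed above are taken from this specification.

```python
from enum import Enum


class StudyStatus(Enum):
    """The possible values listed for csmd:study_status."""
    NEW = "NEW"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETE = "COMPLETE"
    CANCELLED = "CANCELLED"


def is_valid_study_status(text: str) -> bool:
    """Return True if text is one of the listed status values (case-sensitive)."""
    return any(text == status.value for status in StudyStatus)
```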

Property Id csmd:user_fullName
Property Name: user_fullName
Domain: csmd:User
Range: http://www.w3.org/2001/XMLSchema#string
Description: Full name of a user - may include title

Property Id csmd:user_name
Property Name: user_name
Domain: csmd:User
Range: http://www.w3.org/2001/XMLSchema#string
Description: The name of the user to match that provided by the authentication mechanism
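
Each data property above pairs a domain class with an XML Schema datatype as its range. As a rough sketch of how a client might check a value against the declared range before writing it, the code below maps a handful of the properties listed above to Python types. The table is a deliberately small excerpt, the function name is hypothetical, and the mapping from XSD datatypes to Python types is an assumption of this example rather than part of the model.

```python
from datetime import datetime

XSD = "http://www.w3.org/2001/XMLSchema#"

# A small excerpt of the data property ranges listed above.
DATA_PROPERTY_RANGES = {
    "csmd:parameter_stringValue": XSD + "string",
    "csmd:parametertype_enforced": XSD + "boolean",
    "csmd:parametertype_maximumNumericValue": XSD + "double",
    "csmd:shift_startDate": XSD + "dateTime",
    "csmd:study_name": XSD + "string",
}


def matches_range(property_id: str, value: object) -> bool:
    """Return True if value has a Python type compatible with the property's range."""
    xsd_type = DATA_PROPERTY_RANGES[property_id]
    if xsd_type == XSD + "boolean":
        return isinstance(value, bool)
    if xsd_type == XSD + "double":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if xsd_type == XSD + "dateTime":
        return isinstance(value, datetime)
    return isinstance(value, str)
```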

Object Properties

Property Id csmd:application_job
Property Name application_job
Domain csmd:Application
Range csmd:Job
Inverse of csmd:job_application
Description:

Property Id csmd:datafile_datafileFormat
Property Name datafile_datafileFormat
Domain csmd:Datafile
Range csmd:DatafileFormat
Inverse of csmd:datafileformat_datafile
Description:

Property Id csmd:datafile_dataset
Property Name datafile_dataset
Domain csmd:Datafile
Range csmd:Dataset
Inverse of csmd:dataset_datafile
Description: The dataset which holds this file

Property Id csmd:datafile_destDatafile
Property Name datafile_destDatafile
Domain csmd:Datafile
Range csmd:RelatedDatafile
Description: the destination data file in a datafile-datafile relation

Property Id csmd:datafile_parameter
Property Name datafile_parameter
Domain csmd:Datafile
Range csmd:DatafileParameter
Description: Association of a datafile and a parameter associated with that datafile

Property Id csmd:datafile_sourceDatafile
Property Name datafile_sourceDatafile
Domain csmd:Datafile
Range csmd:RelatedDatafile
Description: the source data file in a datafile-datafile relation

Property Id csmd:datafileformat_datafile
Property Name datafileformat_datafile
Domain csmd:DatafileFormat
Range csmd:Datafile
Description: Files with this format

Property Id csmd:datafileformat_facility
Property Name datafileformat_facility
Domain csmd:DatafileFormat
Range csmd:Facility
Inverse of csmd:facility_datafileFormat
Description: The facility which has defined this format

Property Id csmd:datafileparameter_datafile
Property Name datafileparameter_datafile
Domain csmd:DatafileParameter
Range csmd:Datafile
Inverse of csmd:datafile_parameter
Description: The associated data file

Property Id csmd:dataset_datafile
Property Name dataset_datafile
Domain csmd:Dataset
Range csmd:Datafile
Description: A data file within the dataset

Property Id csmd:dataset_investigation
Property Name dataset_investigation
Domain csmd:Dataset
Range csmd:Investigation
Inverse of csmd:investigation_dataset
Description: The investigation associated with a dataset

Property Id csmd:dataset_parameter
Property Name dataset_parameter
Domain csmd:Dataset
Range csmd:DatasetParameter
Description: A Parameter associated with a Dataset

Property Id csmd:dataset_sample
Property Name dataset_sample
Domain csmd:Dataset
Range csmd:Sample
Inverse of csmd:sample_dataset
Description: A Sample associated with a dataset

Property Id csmd:dataset_type
Property Name dataset_type
Domain csmd:Dataset
Range csmd:DatasetType
Inverse of csmd:datasettype_dataset
Description: The type of a dataset

Property Id csmd:datasetparameter_dataset
Property Name datasetparameter_dataset
Domain csmd:DatasetParameter
Range csmd:Dataset
Inverse of csmd:dataset_parameter
Description: The associated data set

Property Id csmd:datasettype_dataset
Property Name datasettype_dataset
Domain csmd:DatasetType
Range csmd:Dataset
Description: A dataset which is of a given type.

Property Id csmd:datasettype_facility
Property Name datasettype_facility
Domain csmd:DatasetType
Range csmd:Facility
Inverse of csmd:facility_datasetType
Description: The facility which has defined this data set type

Property Id csmd:facility_datafileFormat
Property Name facility_datafileFormat
Domain csmd:Facility
Range csmd:DatafileFormat
Description: A datafile format supported by a facility

Property Id csmd:facility_datasetType
Property Name facility_datasetType
Domain csmd:Facility
Range csmd:DatasetType
Description: The types of datasets supported by a facility

Property Id csmd:facility_facilityCycle
Property Name facility_facilityCycle
Domain csmd:Facility
Range csmd:FacilityCycle
Inverse of csmd:facilitycycle_facility
Description: A Facility cycle (period of facility availability) offered by a facility.

Property Id csmd:facility_instrument
Property Name facility_instrument
Domain csmd:Facility
Range csmd:Instrument
Inverse of csmd:instrument_facility
Description: An Instrument supported by a facility.

Property Id csmd:facility_investigation
Property Name facility_investigation
Domain csmd:Facility
Range csmd:Investigation
Inverse of csmd:investigation_facility
Description: An Investigation undertaken within a facility

Property Id csmd:facility_investigationType
Property Name facility_investigationType
Domain csmd:Facility
Range csmd:InvestigationType
Inverse of csmd:investigationtype_facility
Description: An Investigation Type supported by a facility

Property Id csmd:facility_parameterType
Property Name facility_parameterType
Domain csmd:Facility
Range csmd:ParameterType
Inverse of csmd:parametertype_facility
Description: A parameter type supported by a facility

Property Id csmd:facility_sampleType
Property Name facility_sampleType
Domain csmd:Facility
Range csmd:SampleType
Inverse of csmd:sampletype_facility
Description: A Sample type supported by a facility

Property Id csmd:facilitycycle_facility
Property Name facilitycycle_facility
Domain csmd:FacilityCycle
Range csmd:Facility
Description: The facility which has this cycle

Property Id csmd:facilitycycle_investigation
Property Name facilitycycle_investigation
Domain csmd:FacilityCycle
Range csmd:Investigation
Inverse of csmd:investigation_facilityCycle
Description: an investigation within a particular facility cycle

Property Id csmd:inputdatafile
Property Name inputdatafile
Domain csmd:Job
Range csmd:Datafile
Description: an input datafile of a job

Property Id csmd:inputdataset
Property Name inputdataset
Domain csmd:Job
Range csmd:Dataset
Description: an input dataset of a job

Property Id csmd:instrument_facility
Property Name instrument_facility
Domain csmd:Instrument
Range csmd:Facility
Description: The facility which has this instrument

Property Id csmd:instrument_instrumentScientist
Property Name instrument_instrumentScientist
Domain csmd:Instrument
Range csmd:User
Inverse of csmd:instrumentscientist_instrument
Description: An instrument scientist is a particular user with expertise on the instrument

Property Id csmd:instrument_investigation
Property Name instrument_investigation
Domain csmd:Instrument
Range csmd:Investigation
Inverse of csmd:investigation_instrument
Description: an investigation which has been undertaken on an instrument

Property Id csmd:instrumentscientist_instrument
Property Name instrumentscientist_instrument
Domain csmd:User
Range csmd:Instrument
Description:

Property Id csmd:investigation_dataset
Property Name investigation_dataset
Domain csmd:Investigation
Range csmd:Dataset
Description:

Property Id csmd:investigation_facility
Property Name investigation_facility
Domain csmd:Investigation
Range csmd:Facility
Description: The facility an investigation takes place within

Property Id csmd:investigation_facilityCycle
Property Name investigation_facilityCycle
Domain csmd:Investigation
Range csmd:FacilityCycle
Description: The facility cycle an investigation takes place in

Property Id csmd:investigation_instrument
Property Name investigation_instrument
Domain csmd:Investigation
Range csmd:Instrument
Description: The instrument used in an investigation

Property Id csmd:investigation_investigationUser
Property Name investigation_investigationUser
Domain csmd:Investigation
Range csmd:InvestigationUser
Inverse of csmd:investigationuser_investigation
Description: A user who contributes to an investigation

Property Id csmd:investigation_keyword
Property Name investigation_keyword
Domain csmd:Investigation
Range csmd:Keyword
Inverse of csmd:keyword_investigation
Description: A keyword associated with an investigation

Property Id csmd:investigation_parameter
Property Name investigation_parameter
Domain csmd:Investigation
Range csmd:InvestigationParameter
Inverse of csmd:investigationparameter_investigation
Description: A parameter associated with an investigation

Property Id csmd:investigation_publication
Property Name investigation_publication
Domain csmd:Investigation
Range csmd:Publication
Inverse of csmd:publication_investigation
Description: A publication associated with an investigation

Property Id csmd:investigation_sample
Property Name investigation_sample
Domain csmd:Investigation
Range csmd:Sample
Inverse of csmd:sample_investigation
Description: A sample used within an investigation

Property Id csmd:investigation_shift
Property Name investigation_shift
Domain csmd:Investigation
Range csmd:Shift
Inverse of csmd:shift_investigation
Description: A shift in which the investigation took place.

Property Id csmd:investigation_study
Property Name investigation_study
Domain csmd:Investigation
Range csmd:Study
Inverse of csmd:study_investigation
Description: Association of a Study to an investigation

Property Id csmd:investigation_type
Property Name investigation_type
Domain csmd:Investigation
Range csmd:InvestigationType
Inverse of csmd:investigationtype_investigation
Description: The type of an investigation

Property Id csmd:investigationparameter_investigation
Property Name investigationparameter_investigation
Domain csmd:InvestigationParameter
Range csmd:Investigation
Description: The associated investigation of a parameter

Property Id csmd:investigationtype_facility
Property Name investigationtype_facility
Domain csmd:InvestigationType
Range csmd:Facility
Description: The facility which has defined this investigation type

Property Id csmd:investigationtype_investigation
Property Name investigationtype_investigation
Domain csmd:InvestigationType
Range csmd:Investigation
Description: An investigation with a particular type

Property Id csmd:investigationuser_investigation
Property Name investigationuser_investigation
Domain csmd:InvestigationUser
Range csmd:Investigation
Description: The investigation a user contributes to in a particular role

Property Id csmd:investigationuser_user
Property Name investigationuser_user
Domain csmd:InvestigationUser
Range csmd:User
Inverse of csmd:user_investigationUser
Description:

Property Id csmd:job_application
Property Name job_application
Domain csmd:Job
Range csmd:Application
Description: A Software Application used to run a job

Property Id csmd:keyword_investigation
Property Name keyword_investigation
Domain csmd:Keyword
Range csmd:Investigation
Description: The investigation to which this keyword applies

Property Id csmd:outputdatafile
Property Name outputdatafile
Domain csmd:Job
Range csmd:Datafile
Description: an output datafile of a job

Property Id csmd:outputdataset
Property Name outputdataset
Domain csmd:Job
Range csmd:Dataset
Description: an output dataset of a job
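
The inputdatafile and outputdatafile properties of a Job make it possible to trace which data files a derived file came from. The sketch below walks those links transitively. The Job class is a hypothetical simplification that identifies data files and applications by plain strings, and the file names in the usage note are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Job:
    """Hypothetical stand-in for csmd:Job with its input and output datafiles."""
    application: str
    input_datafiles: List[str] = field(default_factory=list)
    output_datafiles: List[str] = field(default_factory=list)


def upstream_datafiles(jobs: List[Job], target: str) -> Set[str]:
    """Return every datafile that target was directly or indirectly derived from."""
    found: Set[str] = set()
    pending = [target]
    while pending:
        current = pending.pop()
        for job in jobs:
            if current in job.output_datafiles:
                for source in job.input_datafiles:
                    if source not in found:
                        found.add(source)
                        pending.append(source)
    return found
```

Given a "reduce" job that turns raw.nxs into reduced.nxs and a "fit" job that turns reduced.nxs into fit.csv, upstream_datafiles(jobs, "fit.csv") yields both raw.nxs and reduced.nxs.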

Property Id csmd:parameter_type
Property Name parameter_type
Domain csmd:Parameter
Range csmd:ParameterType
Description: The type of the parameter

Property Id csmd:parametertype_facility
Property Name parametertype_facility
Domain csmd:ParameterType
Range csmd:Facility
Description: The facility which has defined this parameter type

Property Id csmd:parametertype_permissiblestringvalue
Property Name parametertype_permissiblestringvalue
Domain csmd:ParameterType
Range csmd:PermissibleStringValue
Inverse of csmd:permissiblestringvalue_type
Description:

Property Id csmd:permissiblestringvalue_type
Property Name permissiblestringvalue_type
Domain csmd:PermissibleStringValue
Range csmd:ParameterType
Description: The parameter type to which this permissible string value applies

Property Id csmd:publication_investigation
Property Name publication_investigation
Domain csmd:Publication
Range csmd:Investigation
Description:

Property Id csmd:relateddatafile_destDatafile
Property Name relateddatafile_destDatafile
Domain csmd:RelatedDatafile
Range csmd:Datafile
Description:

Property Id csmd:relateddatafile_sourceDatafile
Property Name relateddatafile_sourceDatafile
Domain csmd:RelatedDatafile
Range csmd:Datafile
Description:
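
A RelatedDatafile links a source data file to a destination data file and labels the link with relateddatafile_relation (for example "COPY"). A minimal sketch of that association is given below; the class and function names are hypothetical, and data files are identified by plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RelatedDatafile:
    """Hypothetical mirror of csmd:RelatedDatafile."""
    source_datafile: str
    dest_datafile: str
    relation: str  # e.g. "COPY"


def destinations_of(links: Iterable[RelatedDatafile], source: str, relation: str) -> List[str]:
    """List the destination datafiles reached from source by the given relation."""
    return [
        link.dest_datafile
        for link in links
        if link.source_datafile == source and link.relation == relation
    ]
```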

Property Id csmd:sample_dataset
Property Name sample_dataset
Domain csmd:Sample
Range csmd:Dataset
Description: A dataset related to a sample

Property Id csmd:sample_investigation
Property Name sample_investigation
Domain csmd:Sample
Range csmd:Investigation
Description:

Property Id csmd:sample_parameter
Property Name sample_parameter
Domain csmd:Sample
Range csmd:SampleParameter
Inverse of csmd:sampleparameter_sample
Description: A Parameter associated with a sample

Property Id csmd:sample_type
Property Name sample_type
Domain csmd:Sample
Range csmd:SampleType
Inverse of csmd:sampletype_sample
Description:

Property Id csmd:sampleparameter_sample
Property Name sampleparameter_sample
Domain csmd:SampleParameter
Range csmd:Sample
Description: The associated sample

Property Id csmd:sampletype_facility
Property Name sampletype_facility
Domain csmd:SampleType
Range csmd:Facility
Description: The facility which has defined this sample type

Property Id csmd:sampletype_sample
Property Name sampletype_sample
Domain csmd:SampleType
Range csmd:Sample
Description:

Property Id csmd:shift_investigation
Property Name shift_investigation
Domain csmd:Shift
Range csmd:Investigation
Description:

Property Id csmd:study_investigation
Property Name study_investigation
Domain csmd:Study
Range csmd:Investigation
Description: An investigation in a study

Property Id csmd:study_user
Property Name study_user
Domain csmd:Study
Range csmd:User
Description: The user responsible for the study

Property Id csmd:user_investigationUser
Property Name user_investigationUser
Domain csmd:User
Range csmd:InvestigationUser
Description:

Property Id csmd:user_study
Property Name user_study
Domain csmd:User
Range csmd:Study
Inverse of csmd:study_user
Description: The relationship between a user and a study they contribute to.

Property Id http://www.purl.org/net/CSMD/4.0#parametertype_parameter
Property Name parametertype_parameter
Domain csmd:ParameterType
Range csmd:Parameter
Inverse of csmd:parameter_type
Description:
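
Many of the object properties above are declared as the inverse of another property, so a statement in one direction implies the statement in the other. The sketch below shows one way this could be materialised over plain (subject, property, object) tuples. It uses only three of the inverse pairs listed above, the function name and the "ex:" identifiers are hypothetical, and a real deployment would more likely rely on an OWL reasoner than on hand-written code.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# A few of the "Inverse of" declarations from the listing above.
INVERSE_OF = {
    "csmd:datafile_dataset": "csmd:dataset_datafile",
    "csmd:dataset_investigation": "csmd:investigation_dataset",
    "csmd:user_study": "csmd:study_user",
}


def with_inverses(triples: Set[Triple]) -> Set[Triple]:
    """Return triples plus the statements implied by the inverse declarations."""
    # Inverse-of is symmetric, so index the declarations in both directions.
    inverse = dict(INVERSE_OF)
    inverse.update({v: k for k, v in INVERSE_OF.items()})
    result = set(triples)
    for subject, prop, obj in triples:
        if prop in inverse:
            result.add((obj, inverse[prop], subject))
    return result
```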

References

[Matthews, B., et. al 2009]
Matthews, B., et al. Using a Core Scientific Metadata Model in Large-Scale Facilities. 5th International Digital Curation Conference, London, UK, (2009).
[PROV-O]
PROV-O: The PROV Ontology, W3C Recommendation 30 April 2013, http://www.w3.org/TR/prov-o/
[PaNData-ODI D6.1]
Matthews, B. et al., Model of the data continuum in Photon and Neutron Facilities. PaNdata ODI, Deliverable D6.1. (2012). http://pan-data.eu/sites/pan-data.eu/files/PaNdataODI-D6.1.pdf
[Sufi and Matthews 2004]
Sufi, S., Matthews, B. CCLRC Scientific Metadata Model: Version 2. DL Technical Reports, DL-TR-2004-001, (2004). http://epubs.cclrc.ac.uk/work-details?w=30324
[Sufi and Matthews 2005]
Sufi, S., Matthews, B. The CCLRC Scientific Metadata Model: a metadata model for the exploitation of scientific studies and associated data. In Contributions in Knowledge and Data Management in Grids, eds. Domenico Talia, Angelos Bilas, Marios Dikaiakos, CoreGRID 3, Springer-Verlag, (2005).