Core Scientific Metadata Model (CSMD)

The Core Scientific Metadata Model (CSMD) is a study-data-oriented model which has been developed at STFC over many years. It captures high-level information about scientific studies and the data that they produce. The ICAT schema originates from the CSMD, though it has a number of additional implementation-specific features.

The CSMD was developed to support data collected within a facility's scientific workflow. However, the model is also designed to be generic across scientific disciplines and has application beyond facilities science, particularly in the "structural sciences" (such as chemistry, materials science, earth science, and biochemistry), which are concerned with analysing the structure of substances and which perform systematic experimental analyses on samples of those materials.

The model is organised around the notion of a Study: a body of scientific work on a particular subject of investigation. During a study, a scientist performs a number of investigations, e.g. experiments, observations, measurements and simulations. Results from these investigations usually pass through several stages: raw data, analysed or derived data, and end results suitable for publication. The model describes the hierarchical structure of scientific research programmes, projects and studies, and also provides a generic model of how data sets are organised into collections and files.
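As a rough illustration of the hierarchy described above, the following Python sketch models a study, its investigations, and their data sets and files. All class, field and file names here are hypothetical and chosen for illustration only; they are not taken from the CSMD specification or the ICAT schema, and attaching the data stage to the data set is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DataStage(Enum):
    """Stages that results pass through (names are illustrative)."""
    RAW = "raw"
    DERIVED = "derived"
    PUBLISHED = "published"


@dataclass
class Datafile:
    name: str


@dataclass
class Dataset:
    # Assumption: the stage is recorded per data set.
    name: str
    stage: DataStage
    files: List[Datafile] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    kind: str  # e.g. "experiment", "observation", "measurement", "simulation"
    datasets: List[Dataset] = field(default_factory=list)


@dataclass
class Study:
    subject: str
    investigations: List[Investigation] = field(default_factory=list)

    def files_at_stage(self, stage: DataStage) -> List[str]:
        """Return the names of all files in data sets at the given stage."""
        return [
            f.name
            for inv in self.investigations
            for ds in inv.datasets
            if ds.stage is stage
            for f in ds.files
        ]


# Hypothetical example: one study with a single experiment.
study = Study(
    subject="Structure of a sample material",
    investigations=[
        Investigation(
            title="Beamline run",
            kind="experiment",
            datasets=[
                Dataset("raw scans", DataStage.RAW,
                        [Datafile("run_001.dat"), Datafile("run_002.dat")]),
                Dataset("reduced data", DataStage.DERIVED,
                        [Datafile("reduced.csv")]),
            ],
        )
    ],
)
```

Calling `study.files_at_stage(DataStage.RAW)` walks the hierarchy from study to investigation to data set to file and collects the raw file names. The higher levels mentioned in the text (programmes and projects) could be added as further containers in the same way.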

Documentation on the Core Scientific Metadata Model

The latest version of the CSMD is version 4.0, from March 2013. The main extension over earlier versions of the model is enhanced support for linking analysed data and software into the investigation.

For an earlier published paper see:

The earlier version 2.0 is described in:

A discussion on extending the model to consider capturing links to derived data is in:


Brian Matthews, STFC,
August 2015