Introduction

The Configurable Data Curation System (CDCS) is a family of systems for structuring, searching, and sharing data. The two primary system types in this family are the Materials Data Curation System (MDCS), referred to here as the curator, and the Materials Resource Registry (MRR), referred to here as the registry. The curator is used to structure data and enter it into accessible online data repositories. The registry is used to locate data in distributed data repositories. Both were originally created in the materials science community under the Materials Genome Initiative (MGI).

MDCS 2.0 is the first release of the 2.0 series curator, in which all of the functionality from the previous major curator release (1.5) has been ported into a modular core of configurable applications. Throughout the remainder of the MDCS 2.0 series, functionality from the registry, along with functionality from early adopters and customizations of both the curator and registry systems, is being ported into the MDCS 2.0 core.

This document captures the essential functionality of the MDCS 2.0 curator system. It will be expanded and updated over time as more functionality becomes available.

Essential Functionality

The essential functionality of the curator includes the following:

Function	Description
General tasks	General user tasks such as requesting an account, logging in, etc.
Structuring data	Creating data entry templates with the composer application.
Entering data	Inputting data using data entry templates with the curator application.
Accessing data	Controlling data access via creation and use of users, groups, and workspaces.
Finding data	Searching for data locally and remotely.
Transforming data	Transforming data from one format or representation to another.
Managing resources	Managing resources and configurations on the curator platform.
Operations	Operational tasks such as deployment, creating custom themes, etc.

Essential Technology

The system's functionality is implemented using a Python-based web framework (Django), with a NoSQL database (MongoDB) as its primary data store. XML technologies provide the primary data representation for system artifacts: data templates and types (XML schemas), data documents and records (XML documents), data transformations (XSLT), data entry form generation (XML-based), and data query representation (XML-based).
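
To make the idea of an XML data record more concrete, the following is a minimal sketch of reading one with Python's standard library. It is not taken from the MDCS code base: the record layout, the element and attribute names (`sample`, `id`, `measurement`, `name`, `unit`), and the helper `extract_measurements` are all hypothetical and chosen only for illustration. In the curator, the structure of a record would be defined by an XML schema template instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element and attribute names are illustrative only
# and do not come from any actual MDCS template.
RECORD_XML = """\
<sample>
  <id>sample-001</id>
  <measurement name="length" unit="mm">12.5</measurement>
  <measurement name="width" unit="mm">3.0</measurement>
</sample>
"""


def extract_measurements(xml_text):
    """Return {measurement name: (numeric value, unit)} for one record."""
    root = ET.fromstring(xml_text)
    measurements = {}
    # findall("measurement") matches direct children of the root element.
    for node in root.findall("measurement"):
        measurements[node.get("name")] = (float(node.text), node.get("unit"))
    return measurements


# Read the record identifier and its measurements.
record_id = ET.fromstring(RECORD_XML).findtext("id")
measurements = extract_measurements(RECORD_XML)
```

Here `record_id` holds the text of the `id` element and `measurements` maps each measurement name to its value and unit. Because the record is plain XML, the same document could also be checked against a schema or passed through an XSLT transformation using an XML library of choice.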