TL;DR: The use of Common Data Elements can facilitate cross-study comparisons, data aggregation, and meta-analyses; simplify training and operations; improve overall efficiency; promote interoperability between different systems; and improve the quality of data collection.
Abstract: The use of Common Data Elements can facilitate cross-study comparisons, data aggregation, and meta-analyses; simplify training and operations; improve overall efficiency; promote interoperability between different systems; and improve the quality of data collection. A Common Data Element is a combination of a precisely defined question (variable) paired with a specified set of responses to the question that is common to multiple datasets or used across different studies. Common Data Elements, especially when they conform to accepted standards, are identified by research communities from variable sets currently in use or are newly developed to address a designated data need. There are no formal international specifications governing the construction or use of Common Data Elements. Consequently, Common Data Elements tend to be made available by research communities on an empiric basis. Some limitations of Common Data Elements are that there may still be differences across studies in the interpretation and implementation of the Common Data Elements, variable validity in different populations, and inhibition by some existing research practices and the use of legacy data systems. Current National Institutes of Health efforts to support Common Data Element use are linked to the strengthening of National Institutes of Health Data Sharing policies and the investments in data repositories. Initiatives include cross-domain and domain-specific resources, construction of a Common Data Element Portal, and establishment of trans-National Institutes of Health working groups to address technical and implementation topics. The National Institutes of Health is seeking to lower the barriers to Common Data Element use through greater awareness and encourage the culture change necessary for their uptake and use. 
As the National Institutes of Health, other agencies, professional societies, patient registries, and advocacy groups continue efforts to develop and promote the responsible use of Common Data Elements, particularly if linked to accepted data standards and terminologies, continued engagement with and feedback from the research community will remain important.
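The definition above (a precisely defined question paired with a specified set of responses) can be illustrated with a minimal sketch. The class name `CommonDataElement`, its fields, and the smoking-status example are hypothetical and are not taken from any National Institutes of Health resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical model: one question plus its permissible responses."""
    name: str
    question: str
    permissible_values: frozenset

    def is_valid(self, response: str) -> bool:
        # A response conforms only if it is one of the specified values.
        return response in self.permissible_values


# Illustrative element; the wording and value set are assumptions.
smoking_status = CommonDataElement(
    name="smoking_status",
    question="What is the participant's current smoking status?",
    permissible_values=frozenset({"never", "former", "current", "unknown"}),
)


def invalid_responses(cde, responses):
    """Return the responses that fall outside the CDE's value set."""
    return [r for r in responses if not cde.is_valid(r)]
```

Two studies that share such an element can pool their responses directly, because both draw from the same value set. A free-text answer such as "quit" would be flagged by `invalid_responses` instead of silently entering the dataset.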
TL;DR: A model and a repository of radiologic CDEs is described, and three important applications are highlighted, showing how a common data element (CDE) can improve the ability to exchange information seamlessly among information systems.
Abstract: In this article the authors describe an approach to improving automated exchange of radiologic information: the common data element, which defines the attributes and allowable values of a unit of information.
TL;DR: The Network Data Dictionary is a device for enabling standardization of data structures in programs, file layouts and Data Base Management System (DBMS) schema residing in Include Files located on one or more computers in a network.
Abstract: The Network Data Dictionary is a device for enabling standardization of data structures in programs, file layouts and Data Base Management System (DBMS) schema residing in Include Files located on one or more computers in a network. By making the data structures comply with the data element definitions stored in a common data element dictionary, improvements in the quality, accuracy, and consistency of data can be obtained, while simultaneously providing productivity advantages to programmers. The device is set up to organize a set of disparate Include Files (representing data structure descriptions corresponding to program structures, file layouts, and DBMS schema), in one or more computers in a network under a common scheme called the Include File Dictionary, so that these Include Files are made accessible by the device to programmers for sharing, controlled modification, and use. The Include Files are enabled by the device so that programmers can edit them with reference to a common data element dictionary (DED) residing on one of the network nodes. Measurements of the extent to which Include Files correspond to dictionary standards are reported in compilation reports so that corrective actions (such as reconciling conflicting data element definitions) can be taken.
TL;DR: An empirical examination of existing databases reveals that designations of data elements, like other terminological products, are subject to the vagaries of polysemy and synonymy; the most practical approach is to compile an open-ended dictionary of common data element types for use as a mapping device during the data preparation stage.
Abstract: Differing theoretical and methodological views and working-group needs have spawned a wide diversity in the content, layout and internal structure of terminological entries in database environments, which in turn complicates standardization and data interchange. Major criticisms lodged against the data element list provided in ISO 6156 (MATER) prompted the authors to conduct an empirical examination of over thirty existing databases to ascertain which data elements are truly used in practice (as opposed to those which are espoused or rejected in theory). Their results reveal that designations of data elements, like other terminological products, are subject to the vagaries of polysemy and synonymy. They conclude that, given the widespread differences in approach evidenced in existing databases, the most practical approach to data element concerns during interchange is to compile an open-ended dictionary of common data element types for use as a mapping device during the data preparation stage.
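The mapping-device idea can be sketched as follows. This is a minimal illustration, not the authors' dictionary: the common types, the synonym lists, and the function names `build_lookup` and `map_record` are all hypothetical.

```python
# Hypothetical open-ended dictionary: each common data element type
# lists the local field names (synonyms) observed for it.
COMMON_TYPES = {
    "term": {"term", "entry term", "main entry"},
    "definition": {"definition", "def", "explanation"},
    "source": {"source", "reference", "bibliographic source"},
}


def build_lookup(common_types):
    """Invert the dictionary so each local name points to its common type."""
    lookup = {}
    for common, synonyms in common_types.items():
        for name in synonyms:
            lookup[name.lower()] = common
    return lookup


def map_record(record, lookup):
    """Rename known fields and collect unknown ones for later review."""
    mapped, unmapped = {}, []
    for field, value in record.items():
        common = lookup.get(field.strip().lower())
        if common is None:
            # Unknown names are reported so the dictionary can be extended.
            unmapped.append(field)
        else:
            mapped[common] = value
    return mapped, unmapped
```

This sketch handles synonymy only (several local names for one type). Polysemy, where the same local name means different things in different databases, would need a separate lookup per source database, which is omitted here.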
TL;DR: In this article, the authors present an enhanced version of a data quality assessment (DQA) tool by linking it to common data element definitions stored in a metadata repository (MDR), adopting the harmonized DQA framework from Kahn et al.
Abstract: Background: Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we have previously presented a data quality assessment (DQA) tool providing a standardized approach to assess DQ of the research data repositories at the MIRACUM consortium's partner sites. Objectives: Major limitations of the former approach included manual interpretation of the results and hard coding of analyses, making their expansion to new data elements and databases time-consuming and error-prone. We here present an enhanced version of the DQA tool by linking it to common data element definitions stored in a metadata repository (MDR), adopting the harmonized DQA framework from Kahn et al. and its application within the MIRACUM consortium. Methods: Data quality checks were consequently aligned to a harmonized DQA terminology. Database-specific information was systematically identified and represented in an MDR. Furthermore, a structured representation of logical relations between data elements was developed to model plausibility statements in the MDR. Results: The MIRACUM DQA tool was linked to data element definitions stored in a consortium-wide MDR. Additional databases used within MIRACUM were linked to the DQ checks by extending the respective data elements in the MDR with the required information. The evaluation of DQ checks was automated. An adaptable software implementation is provided with the R package DQAstats. Conclusion: The enhancements of the DQA tool facilitate the future integration of new data elements and make the tool scalable to other databases and data models. 
It has been provided to all ten MIRACUM partners and was successfully deployed and integrated into their respective data integration center infrastructure.
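The general idea of driving quality checks from metadata, instead of hard coding each analysis, can be sketched as follows. This is an illustrative Python sketch only; it does not reproduce the DQAstats API or the MIRACUM MDR schema, and the `MDR` dictionary, rule layout, and function names are hypothetical.

```python
# Hypothetical metadata entries; a real MDR would serve these definitions.
MDR = {
    "age_years": {"type": "range", "min": 0, "max": 120},
    "sex": {"type": "values", "allowed": {"female", "male", "other", "unknown"}},
}


def check_value(element, value, mdr=MDR):
    """Evaluate one value against the rule stored for its data element."""
    rule = mdr[element]
    if rule["type"] == "range":
        return rule["min"] <= value <= rule["max"]
    if rule["type"] == "values":
        return value in rule["allowed"]
    raise ValueError(f"unknown rule type: {rule['type']}")


def conformance_report(rows, mdr=MDR):
    """Count rule violations per data element across all rows."""
    violations = {element: 0 for element in mdr}
    for row in rows:
        for element in mdr:
            if element in row and not check_value(element, row[element], mdr):
                violations[element] += 1
    return violations
```

Under this design, supporting a new data element means adding an entry to the metadata, with no change to the checking code.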