TL;DR: MeDShare is a blockchain-based system that provides data provenance, auditing, and control for medical data shared in cloud repositories among big data entities. It employs smart contracts and an access control mechanism to track the behavior of the data.
Abstract: The dissemination of patients’ medical records results in diverse risks to patients’ privacy, as malicious activities on these records cause severe damage to the reputation, finances, and so on of all parties related directly or indirectly to the data. Current methods to manage and protect medical records have proven insufficient. In this paper, we propose MeDShare, a system that addresses the issue of medical data sharing among medical big data custodians in a trust-less environment. The system is blockchain-based and provides data provenance, auditing, and control for shared medical data in cloud repositories among big data entities. MeDShare monitors entities that access data from a data custodian system in order to detect malicious use. In MeDShare, data transitions and sharing from one entity to another, along with all actions performed on the MeDShare system, are recorded in a tamper-proof manner. The design employs smart contracts and an access control mechanism to track the behavior of the data and to revoke access from offending entities when a violation of data permissions is detected. The performance of MeDShare is comparable to current cutting-edge solutions for data sharing among cloud service providers. By implementing MeDShare, cloud service providers and other data guardians will be able to achieve data provenance and auditing while sharing medical data with entities such as research and medical institutions, with minimal risk to data privacy.
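The tamper-proof recording and access revocation described above can be illustrated with a minimal sketch. It is not the MeDShare implementation: the abstract does not specify data structures, so the `AuditLog` class, its method names, and the rule "revoke on the first unpermitted action" are all assumptions made for illustration. A plain in-memory hash chain stands in for the blockchain and smart contracts.

```python
import hashlib
import json


class AuditLog:
    """Append-only log; each entry commits to the previous entry's hash.

    Hypothetical stand-in for a blockchain-backed audit trail.
    """

    def __init__(self):
        self.entries = []
        self.revoked = set()

    def _digest(self, body):
        # Canonical JSON so the same body always hashes to the same value.
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

    def record(self, entity, action, permitted):
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {"entity": entity, "action": action, "permitted": permitted, "prev": prev}
        self.entries.append({**body, "hash": self._digest(body)})
        if not permitted:
            # Assumed policy: any violation revokes the entity's access.
            self.revoked.add(entity)

    def can_access(self, entity):
        return entity not in self.revoked

    def verify(self):
        """Recompute the chain; returns False if any entry was altered."""
        prev = "0" * 64
        for entry in self.entries:
            body = {k: entry[k] for k in ("entity", "action", "permitted", "prev")}
            if entry["prev"] != prev or entry["hash"] != self._digest(body):
                return False
            prev = entry["hash"]
        return True
```

Because every entry embeds the hash of its predecessor, editing an earlier entry makes `verify()` fail, which is the property that makes the log tamper-evident in this sketch.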
TL;DR: An overall framework for data governance is provided that can be used by researchers to focus on important data governance issues, and by practitioners to develop an effective data governance approach, strategy and design.
Abstract: Organizations are becoming increasingly serious about the notion of "data as an asset" as they face increasing pressure for reporting a "single version of the truth." In a 2006 survey of 359 North American organizations that had deployed business intelligence and analytic systems, a program for the governance of data was reported to be one of the five success "practices" for deriving business value from data assets. In light of the opportunities to leverage data assets as well as ensure legislative compliance with mandates such as the Sarbanes-Oxley (SOX) Act and Basel II, data governance has also recently been given significant prominence in practitioners' conferences, such as TDWI (The Data Warehousing Institute) World Conference and DAMA (Data Management Association) International Symposium.
The objective of this article is to provide an overall framework for data governance that can be used by researchers to focus on important data governance issues, and by practitioners to develop an effective data governance approach, strategy and design. Designing data governance requires stepping back from day-to-day decision making and focusing on identifying the fundamental decisions that need to be made and who should be making them. Based on Weill and Ross, we also differentiate between governance and management as follows:
• Governance refers to what decisions must be made to ensure effective management and use of IT (decision domains) and who makes the decisions (locus of accountability for decision-making).
• Management involves making and implementing decisions.
For example, governance includes establishing who in the organization holds decision rights for determining standards for data quality. Management involves determining the actual metrics employed for data quality. Here, we focus on the former.
Corporate governance has been defined as a set of relationships between a company's management, its board, its shareholders and other stakeholders that provide a structure for determining organizational objectives and monitoring performance, thereby ensuring that corporate objectives are attained. Considering the synergy between macroeconomic and structural policies, corporate governance is a key element in not only improving economic efficiency and growth, but also enhancing corporate confidence. A framework for linking corporate and IT governance (see Figure 1) has been proposed by Weill and Ross.
Unlike these authors, however, we differentiate between IT assets and information assets: IT assets refer to technologies (computers, communication and databases) that help support the automation of well-defined tasks, while information assets (or data) are defined as facts having value or potential value that are documented. Note that in the context of this article, we do not differentiate between data and information.
Next, we use the Weill and Ross framework for IT governance as a starting point for our own framework for data governance. We then propose a set of five data decision domains, why they are important, and guidelines for what governance is needed for each decision domain. By operationalizing the locus of accountability of decision making (the "who") for each decision domain, we create a data governance matrix, which can be used by practitioners to design their data governance. The insights presented here have been informed by field research, and address an area that is of growing interest to the information systems (IS) research and practice community.
TL;DR: A method is proposed for matching data records, held by a plurality of data custodians, that relate to a particular entity; the data records in each cluster are representative of a data record held by a respective data custodian.
Abstract: An aspect of the present invention provides a method for matching data records held by a plurality of data custodians that relate to a particular entity. One such method comprises the steps of receiving a plurality of clusters of data records from each of the plurality of data custodians (310), comparing related data records received from each of the data custodians (320) and determining whether the related data records relate to the entity based on the result of the comparison (330). The data records in each cluster are representative of a data record held by a respective data custodian. Other aspects of the present invention provide systems and computer program products that embody the methods of the present invention.
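The three steps named above (receive clusters, compare related records, decide whether they relate to the same entity) can be sketched as follows. This is an illustrative assumption, not the claimed method: the abstract does not say how records are compared, so the field names `name` and `dob`, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.85 threshold are all hypothetical choices.

```python
from difflib import SequenceMatcher


def similarity(rec_a, rec_b, fields=("name", "dob")):
    """Mean string similarity over the chosen fields (hypothetical field names)."""
    scores = [
        SequenceMatcher(None, str(rec_a[f]).lower(), str(rec_b[f]).lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def match_across_custodians(clusters_by_custodian, threshold=0.85):
    """Pair up clusters from different custodians judged to describe one entity.

    clusters_by_custodian maps custodian -> {cluster_id: representative_record}.
    The threshold is an assumed cut-off, not a value from the source.
    """
    matches = []
    custodians = sorted(clusters_by_custodian)
    for i, first in enumerate(custodians):
        for second in custodians[i + 1:]:
            for id_a, rep_a in clusters_by_custodian[first].items():
                for id_b, rep_b in clusters_by_custodian[second].items():
                    # Step 2: compare; step 3: decide via the threshold.
                    if similarity(rep_a, rep_b) >= threshold:
                        matches.append(((first, id_a), (second, id_b)))
    return matches
```

Comparing one representative record per cluster, rather than every raw record, keeps the number of cross-custodian comparisons proportional to the number of clusters.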
TL;DR: Although data governance has received more attention in corporate settings, and librarians already possess some of the related skills, knowledge of data governance is foundational for research data services, especially because it appears at all levels of those services and is applicable to big data.
Abstract: Data governance and data literacy are two important building blocks in the knowledge base of information professionals involved in supporting data-intensive research, and both address data quality
TL;DR: A data sharing framework is proposed that guarantees the authenticity of shared data in real time and provides transactional privacy in a blockchain network. It can significantly reduce the turnaround time for data sharing, improve the decision-making process, and reduce the overall cost.
Abstract: Personal data such as electronic medical records and academic records are critical and sensitive private information. This personal information is usually hosted across many data-custodian systems. A Personal Data Store (PDS) is a service that lets an individual store, manage and deploy their key personal data in a highly secure and structured way. It also gives the user a central point of control for their personal information. One of the inherent problems of digital records is that they can be easily forged. Therefore, the data consumer (with whom the data is shared) often needs to verify the authenticity of the shared document or record by communicating with the issuing authority (e.g., the data custodian). However, this process is time-consuming and inefficient. Recently, blockchain has gained tremendous attention from both industry and academia for distributed recording and immutable transactions. Blockchain provides a shared, immutable and transparent history of transactions, enabling the building of applications that incorporate trust, accountability and transparency. This provides a unique opportunity to develop a secure and trustworthy data sharing system using blockchain. However, blockchain was primarily proposed for publicly verifiable transactions and does not provide privacy to individuals. In this paper, we propose a data sharing framework that guarantees the authenticity of the shared data in real time and provides transactional privacy in a blockchain network. We have implemented our framework in a prototype that ensures privacy, integrity, and fine-grained access control over the shared data. The proposed work can significantly reduce the turnaround time for data sharing, improve the decision-making process, and reduce the overall cost.
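One common way to let a data consumer check authenticity without contacting the issuing authority is to anchor a digest of each issued document on a shared ledger. The sketch below illustrates that general idea only; the abstract does not describe the framework's actual protocol, so the `Ledger` class, its `anchor` and `verify` methods, and the in-memory dictionary standing in for a blockchain are assumptions.

```python
import hashlib


class Ledger:
    """Stand-in for a blockchain: maps document digests to the issuing custodian.

    Only the SHA-256 digest is stored, never the document itself.
    """

    def __init__(self):
        self._anchors = {}

    def anchor(self, issuer, document: bytes) -> str:
        digest = hashlib.sha256(document).hexdigest()
        # First writer wins, so an existing anchor is never overwritten.
        self._anchors.setdefault(digest, issuer)
        return digest

    def verify(self, issuer, document: bytes) -> bool:
        """True only if this exact document was anchored by this issuer."""
        digest = hashlib.sha256(document).hexdigest()
        return self._anchors.get(digest) == issuer
```

A consumer who receives a document recomputes its digest locally and looks it up, so verification needs no round trip to the custodian. Storing only digests keeps document contents off the ledger, though a production design would need more than this to achieve the transactional privacy the abstract describes.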