About: Master data is a research topic. Over its lifetime, 1224 publications have been published on this topic, receiving 11228 citations. The topic is also known as: masterdata & stamdata.
TL;DR: RapidMiner: Data Mining Use Cases and Business Analytics Applications provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors.
Abstract: Powerful, Flexible Tools for a Data-Driven World. As the data deluge continues in today's world, the need to master data mining, predictive analytics, and business analytics has never been greater. These techniques and tools provide unprecedented insights into data, enabling better decision making and forecasting, and ultimately the solution of increasingly complex problems. Learn from the Creators of the RapidMiner Software. Written by leaders in the data mining community, including the developers of the RapidMiner software, RapidMiner: Data Mining Use Cases and Business Analytics Applications provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors. It presents the most powerful and flexible open source software solutions: RapidMiner and RapidAnalytics. The software and their extensions can be freely downloaded at www.RapidMiner.com. Understand Each Stage of the Data Mining Process. The book and software tools cover all relevant steps of the data mining process, from data loading, transformation, integration, aggregation, and visualization to automated feature selection, automated parameter and process optimization, and integration with other tools, such as R packages or your IT infrastructure via web services. The book and software also extensively discuss the analysis of unstructured data, including text and image mining. Easily Implement Analytics Approaches Using RapidMiner and RapidAnalytics. Each chapter describes an application, how to approach it with data mining methods, and how to implement it with RapidMiner and RapidAnalytics. These application-oriented chapters give you not only the necessary analytics to solve problems and tasks, but also reproducible, step-by-step descriptions of using RapidMiner and RapidAnalytics.
The case studies serve as blueprints for your own data mining applications, enabling you to effectively solve similar problems.
TL;DR: MapReduce, in conjunction with the Hadoop Distributed File System (HDFS) and HBase database, as part of the Apache Hadoop project, is a modern approach to analyzing unstructured data.
Abstract: The amount of data in our industry and the world is exploding. Data is being collected and stored at unprecedented rates. The challenge is not only to store and manage the vast volume of data (“big data”), but also to analyze and extract meaningful value from it. There are several approaches to collecting, storing, processing, and analyzing big data. The main focus of the paper is on unstructured data analysis. Unstructured data refers to information that either does not have a pre-defined data model or does not fit well into relational tables. Unstructured data is the fastest growing type of data; examples include imagery, sensor data, telemetry, video, documents, log files, and email data files. There are several techniques to address this problem space of unstructured analytics. The techniques share the common characteristics of scale-out, elasticity, and high availability. MapReduce, in conjunction with the Hadoop Distributed File System (HDFS) and HBase database, as part of the Apache Hadoop project, is a modern approach to analyzing unstructured data. Hadoop clusters are an effective means of processing massive volumes of data, and can be improved with the right architectural approach.
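To make the MapReduce idea concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop, HDFS, or HBase; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen for illustration, not something the paper describes. It only shows the map, group-by-key, and reduce steps that a real cluster would distribute across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit one (word, 1) pair per word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key (done by the framework on a cluster)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for unstructured input.
    print(run_job(["error disk full", "warn disk slow"]))
```

Because each map call sees one line and each reduce call sees one key, both steps can run in parallel on separate machines, which is where the scale-out property mentioned in the abstract comes from.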
TL;DR: In this paper, a method and system for automatically resolving data conflicts in a shared data environment where a plurality of users can concurrently access at least portions of a master data file is presented.
Abstract: A method and system for automatically resolving data conflicts in a shared data environment where a plurality of users can concurrently access at least portions of a master data file is presented. Users process data files by means of local copies of a master data file. When an attempted update of a master data file with an edited data file from a user is detected, the updating file is analyzed to determine if any changes made are in conflict with changes made to the master data file by a second user. If a conflict is detected, it is resolved by merging the updating file into the master file according to a predefined set of rules. For conflicts which are not resolved by rule-based reconciliation, at least one user is notified of the conflict and presented with conflict resolving information and the conflict is resolved according to user input.
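The rule-based reconciliation described above can be sketched as a field-level three-way merge. This is an illustrative assumption, not the patented method: records are modeled as flat dictionaries, `base` is the version the user's local copy was taken from, and `merge_update` and the `rules` mapping are hypothetical names. Field deletion is not handled.

```python
def merge_update(base, master, update, rules):
    """Merge a user's edited copy into the master record.

    base   -- record as it was when the user took a local copy
    master -- current master record (possibly changed by a second user)
    update -- the user's edited record
    rules  -- per-field functions (master_value, update_value) -> merged value

    Returns (merged_record, unresolved), where unresolved maps each field
    that no rule could reconcile to its (master_value, update_value) pair.
    """
    merged = dict(master)
    unresolved = {}
    for name in sorted(set(base) | set(master) | set(update)):
        b, m, u = base.get(name), master.get(name), update.get(name)
        if u == b or u == m:
            continue  # user left the field alone, or both users agree
        if m == b:
            merged[name] = u  # only the updating user changed this field
        elif name in rules:
            merged[name] = rules[name](m, u)  # conflict settled by rule
        else:
            unresolved[name] = (m, u)  # left for a user to decide
    return merged, unresolved


if __name__ == "__main__":
    base = {"price": 10, "stock": 5, "name": "Widget"}
    master = {"price": 12, "stock": 4, "name": "Widget"}
    update = {"price": 11, "stock": 3, "name": "Widget Pro"}
    # Assumed rule: on a stock conflict, keep the lower count.
    print(merge_update(base, master, update, {"stock": min}))
```

Fields returned in `unresolved` correspond to the conflicts for which the abstract says a user is notified and asked for input; the master value is kept for them until that input arrives.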
TL;DR: In this article, a system for specifying, ordering, and building a build-to-order computer system is presented, where a user can specify, order, and build a computer system over the Internet.
Abstract: A system for specifying, ordering, and building a build-to-order computer system. After initiating an ordering session, a user such as a purchaser or designer is presented with a list of options such as a list of operating systems offered by a computer system vendor or manufacturer that may be implemented on a targeted computer system. After receiving an indication of a selection from a first list of options, the system accesses a computer system readable master data base to generate a second list of options such as software programs wherein each option of the second list is compatible with the selection from the first list. The master data base includes entries for every option offered by the computer system vendor or manufacturer and includes at least one tag indicating compatibility with other entries in the master data base. The system can be used to present to the user a plurality of lists wherein all of the options presented are compatible with the previous selections. The system writes indications of the selections in a data file. The data file is provided to manufacturing wherein the selections are implemented on a targeted computer system using the data file. The system may also include a sniffing feature used to determine particular hardware parameters of the targeted computer system. The system uses the determined parameters in generating the compatible lists of options. The system enables a purchaser to buy and order a computer system over a computer network such as the Internet.
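The tag-based filtering of option lists described above can be illustrated with a small sketch. Everything here is hypothetical: the `Option` type, the `MASTER_DB` contents, and the `compatible_options` function are assumptions for illustration, and the compatibility tag is modeled simply as a set of entry names each option works with.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Option:
    name: str
    category: str
    # Compatibility tag: names of other master entries this option works with.
    compatible_with: frozenset = field(default_factory=frozenset)


# Hypothetical master database with one entry per offered option.
MASTER_DB = [
    Option("OS-A", "os"),
    Option("OS-B", "os"),
    Option("Editor", "software", frozenset({"OS-A", "OS-B"})),
    Option("Compiler", "software", frozenset({"OS-A"})),
]


def compatible_options(master_db, category, selections):
    """List options in a category compatible with every earlier selection."""
    return [
        option.name
        for option in master_db
        if option.category == category
        and all(choice in option.compatible_with for choice in selections)
    ]


if __name__ == "__main__":
    first_list = compatible_options(MASTER_DB, "os", [])
    second_list = compatible_options(MASTER_DB, "software", ["OS-B"])
    print(first_list, second_list)
```

Passing the accumulated selections into each call is what keeps every later list consistent with all previous choices.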
TL;DR: In this paper, a transactional data communications system and method is proposed to communicate information within an enterprise having a process control system and a plurality of information technology systems that are communicatively coupled to the process control system via a web services interface and a transactional information server.
Abstract: A transactional data communications system and method communicates information within an enterprise having a process control system and a plurality of information technology systems that are communicatively coupled to the process control system via a web services interface and a transactional information server. The system and method generates transactional process control information and formats the transactional process control information based on an extensible markup language input schema to form formatted transactional process control information. The system and method sends the formatted transactional process control information to the transactional information server via the web services interface and maps the formatted transactional process control information to an extensible markup language output schema associated with one of the plurality of information technology systems to form mapped transactional process control information. The system and method then sends the mapped transactional process control information to the one of the plurality of information technology systems.
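The mapping step, from a message in the input schema to a message in a target system's output schema, can be sketched with Python's standard `xml.etree.ElementTree` module. The element names (`batchEvent`, `ProductionOrder`, and so on), the `FIELD_MAP` table, and the `map_message` function are all assumptions for illustration; the sketch performs no schema validation and no web services transport.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from input-schema tags to output-schema tags.
FIELD_MAP = {"batchId": "OrderNumber", "status": "State"}


def map_message(input_xml, output_root, field_map):
    """Copy mapped child elements from an input message into a new message."""
    source = ET.fromstring(input_xml)
    target = ET.Element(output_root)
    for src_tag, dst_tag in field_map.items():
        node = source.find(src_tag)
        if node is not None:  # skip fields the input message does not carry
            ET.SubElement(target, dst_tag).text = node.text
    return ET.tostring(target, encoding="unicode")


if __name__ == "__main__":
    message = (
        "<batchEvent><batchId>B-17</batchId>"
        "<status>COMPLETE</status></batchEvent>"
    )
    print(map_message(message, "ProductionOrder", FIELD_MAP))
```

Keeping one mapping table per target system would let the same formatted input message be routed to several information technology systems, each with its own output schema.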