TL;DR: This paper builds a data-driven digital twin framework that integrates BIM, IoT, and data mining for advanced project management, facilitating data communication and exploration to better understand, predict, and optimize physical construction operations.
TL;DR: This paper presents a literature study in which the state of the art in the application of event abstraction techniques in the field of process mining is assessed, accompanied by a taxonomy of the existing approaches, which is exploited to highlight interesting novel directions.
Abstract: The execution of processes in companies generates traces of event data, stored in the underlying information system(s), capturing the actual execution of the process. Analyzing event data, i.e., the focus of process mining, yields a detailed understanding of the process, e.g., we are able to discover the control flow of the process and detect compliance and performance issues. Most process mining techniques assume that the event data are of the same and/or appropriate level of granularity. However, in practice, the data are extracted from different systems, e.g., systems for customer relationship management, Enterprise Resource Planning, etc., which record events at different granularity levels. Hence, pre-processing techniques that allow us to abstract event data into the right level of granularity are vital for the successful application of process mining. In this paper, we present a literature study, in which we assess the state-of-the-art in the application of such event abstraction techniques in the field of process mining. The survey is accompanied by a taxonomy of the existing approaches, which we exploit to highlight interesting novel directions.
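As a rough illustration of what abstracting event data to a coarser granularity can mean, the sketch below relabels low-level events with a high-level activity and merges consecutive repeats. The mapping table, the event names, and the function `abstract_trace` are hypothetical and not taken from the surveyed approaches; real techniques are typically far more elaborate.

```python
from itertools import groupby

# Hypothetical mapping from low-level event labels to high-level activities.
ABSTRACTION = {
    "open_form": "Register order",
    "fill_field": "Register order",
    "submit_form": "Register order",
    "scan_item": "Pick items",
    "confirm_pick": "Pick items",
}


def abstract_trace(trace, mapping):
    """Relabel each event, then merge consecutive repeats of the same label."""
    # Events without a mapping entry keep their original label.
    relabeled = [mapping.get(event, event) for event in trace]
    return [label for label, _ in groupby(relabeled)]


low_level = [
    "open_form", "fill_field", "fill_field", "submit_form",
    "scan_item", "confirm_pick", "ship",
]
high_level = abstract_trace(low_level, ABSTRACTION)
print(high_level)  # ['Register order', 'Pick items', 'ship']
```

A fixed lookup table is the simplest possible choice; it only works when the correspondence between low-level and high-level labels is known in advance.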
TL;DR: This paper attempts to synthesise the advantages and disadvantages of the procedural decisions in these approaches by conducting a systematic literature review of process prediction approaches.
Abstract: Process mining enables the reconstruction and evaluation of business processes based on digital traces in IT systems. An increasingly important technique in this context is process prediction. Given a sequence of events of an ongoing trace, process prediction allows forecasting upcoming events or performance measurements. In recent years, multiple process prediction approaches have been proposed, applying different data processing schemes and prediction algorithms. This study focuses on deep learning algorithms since they seem to outperform their machine learning alternatives consistently. Whilst having a common learning algorithm, they use different data preprocessing techniques, implement a variety of network topologies and focus on various goals such as outcome prediction, time prediction or control-flow prediction. Additionally, the log data, evaluation metrics and baselines used by the authors diverge, making the results hard to compare. This paper attempts to synthesise the advantages and disadvantages of the procedural decisions in these approaches by conducting a systematic literature review.
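One preprocessing step that next-event prediction commonly relies on is turning each completed trace into (prefix, next activity) training pairs. The sketch below shows that step only; the function name `prefix_samples` and the toy trace are assumptions made for illustration, and no particular approach from the review is reproduced.

```python
def prefix_samples(trace, min_len=1):
    """Yield (prefix, next_activity) pairs for one completed trace."""
    for i in range(min_len, len(trace)):
        # The prefix is everything observed so far; the label is the next event.
        yield tuple(trace[:i]), trace[i]


trace = ["register", "check", "approve", "pay"]
samples = list(prefix_samples(trace))
for prefix, target in samples:
    print(prefix, "->", target)
```

A trace of length n yields n - 1 pairs under this scheme; a predictor would then be trained on an encoding of each prefix.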
TL;DR: In this paper, the development of the BPM discipline over the years is characterized by the following themes: BPM Systems, process modeling, process design, coordination and interoperability, model management, process mining, and new technologies.
TL;DR: The proposed method to automatically discover manufacturing systems and generate adequate digital twins is applied and the experimental results prove its effectiveness in generating digital models that can correctly estimate the system performance.
TL;DR: OCEL (Object-Centric Event Log) is a generic and scalable format for the storage of object-centric event logs, in which each event can be related to different objects; such logs can be exploited by a new set of process mining techniques.
Abstract: The application of process mining techniques to real-life information systems is often challenging. Considering a Purchase to Pay (P2P) process, several case notions such as order and item are involved, interacting with each other. Therefore, creating an event log where events need to relate to a single case (i.e., process instance) leads to convergence (i.e., the duplication of an event related to different cases) and divergence (i.e., the inability to separate events within the same case) problems. To avoid such problems, object-centric event logs have been proposed, where each event can be related to different objects. These can be exploited by a new set of process mining techniques. This paper describes OCEL (Object-Centric Event Log), a generic and scalable format for the storage of object-centric event logs. The implementation of the format can use either JSON or XML, and tool support is provided.
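The convergence and divergence problems can be made concrete with a small sketch that flattens events onto a single case notion. The dictionary layout below is a hypothetical in-memory structure chosen for illustration; it is not the OCEL JSON or XML schema, and `flatten` is an assumed helper name.

```python
# Hypothetical events, each related to objects of several types.
events = [
    {"id": "e1", "activity": "place order",
     "objects": {"order": ["o1"], "item": ["i1", "i2"]}},
    {"id": "e2", "activity": "pick item",
     "objects": {"order": ["o1"], "item": ["i1"]}},
    {"id": "e3", "activity": "pick item",
     "objects": {"order": ["o1"], "item": ["i2"]}},
]


def flatten(events, object_type):
    """Build one trace per object of the chosen type (a single case notion)."""
    cases = {}
    for event in events:
        for obj in event["objects"].get(object_type, []):
            cases.setdefault(obj, []).append(event["activity"])
    return cases


by_item = flatten(events, "item")
by_order = flatten(events, "order")

# Convergence: the single "place order" event is copied into both item cases.
print(by_item)
# Divergence: within order o1 the two "pick item" events can no longer be
# attributed to their respective items.
print(by_order)
```

Keeping the object references on each event, as object-centric logs do, avoids having to choose between these two flattened views.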
TL;DR: A fuzzy MCDM approach based on spherical fuzzy AHP is offered in this study to manage the problem of selecting process mining technology under uncertain and ambiguous conditions and one-at-a-time sensitivity analysis is applied to reduce the decision-makers’ subjectivity.
Abstract: Process mining (PM) supports organizations by improving their processes using event log data collected from information technology systems. Its primary purposes are discovering actual process models, monitoring and comparing actual and desired workflows, and enhancing processes by considering the discovered model and desired flow. Because process mining gains traction day by day, various technology companies have developed process mining tools to support organizations in managing their business processes with data science. Technology selection is a complicated multi-criteria decision-making (MCDM) problem involving several criteria and experts’ evaluations, including uncertainty and subjectivity. The spherical fuzzy set is a powerful concept to cope with uncertainty by presenting a wider decision-making area and identifying hesitancy. A fuzzy MCDM approach based on spherical fuzzy AHP is offered in this study to manage the problem of selecting process mining technology under uncertain and ambiguous conditions. Then, one-at-a-time sensitivity analysis is applied to reduce the decision-makers’ subjectivity. The study finds that Price, Process Discovery, and Process Analysis & Analytics are the most relevant criteria for deciding on a PM technology. Interestingly, even though Operational Support is one of the less important criteria, it may change the decision on selecting the best PM technology.
TL;DR: In this article, the authors conducted a systematic literature re-evaluation of concept drift in process mining and found that it is a challenge because classical methods assume processes are in a steady state and events share the same process version.
Abstract: Concept drift in process mining (PM) is a challenge as classical methods assume processes are in a steady-state, i.e., events share the same process version. We conducted a systematic literature re...
TL;DR: This paper defines conformance for stochastic process models taking into account frequencies and routing probabilities, and extends the so-called ‘reallocation matrix’ to consider paths to enable detailed diagnostics projected on both model and log.
TL;DR: This paper reviewed management-oriented literature on process mining and business management to assess the state-of-the-art and to pave the way for further research, and identified eleven research gaps sorted into four categories.
TL;DR: This study proposes a novel local post-hoc explanation approach for a deep learning classifier that is expected to facilitate the domain experts in justifying the model decisions and defines the local regions from the validation dataset by using the intermediate latent space representations learned by the deep neural networks.
Abstract: Contemporary process-aware information systems possess the capabilities to record the activities generated during the process execution. To leverage these process-specific fine-granular data, process mining has recently emerged as a promising research discipline. As an important branch of process mining, predictive business process management pursues the objective to generate forward-looking, predictive insights to shape business processes. In this study, we propose a conceptual framework that seeks to establish and promote understanding of the decision-making environment, the underlying business processes and the nature of the user characteristics for developing explainable business process prediction solutions. Consequently, with regard to the theoretical and practical implications of the framework, this study proposes a novel local post-hoc explanation approach for a deep learning classifier that is expected to facilitate the domain experts in justifying the model decisions. In contrast to alternative popular perturbation-based local explanation approaches, this study defines the local regions from the validation dataset by using the intermediate latent space representations learned by the deep neural networks. To validate the applicability of the proposed explanation method, the real-life process log data delivered by Volvo IT Belgium’s incident management system are used. The adopted deep learning classifier achieves a good performance with an area under the ROC curve of 0.94. The generated local explanations are also visualized and presented with relevant evaluation measures which are expected to increase the users’ trust in the black-box model.
TL;DR: In this article, a general data model for multi-dimensional event data based on labeled property graphs is proposed, which allows storing structural and temporal relations in a single, integrated graph-based data structure in a systematic way.
Abstract: Process event data is usually stored either in a sequential process event log or in a relational database. While the sequential, single-dimensional nature of event logs aids querying for (sub)sequences of events based on temporal relations such as “directly/eventually-follows,” it does not support querying multi-dimensional event data of multiple related entities. Relational databases allow storing multi-dimensional event data, but existing query languages do not support querying for sequences or paths of events in terms of temporal relations. In this paper, we propose a general data model for multi-dimensional event data based on labeled property graphs that allows storing structural and temporal relations in a single, integrated graph-based data structure in a systematic way. We provide semantics for all concepts of our data model, and generic queries for modeling event data over multiple entities that interact synchronously and asynchronously. The queries allow for efficiently converting large real-life event data sets into our data model, and we provide 5 converted data sets for further research. We show that typical and advanced queries for retrieving and aggregating such multi-dimensional event data can be formulated and executed efficiently in the existing query language Cypher, giving rise to several new research questions. Specifically, aggregation queries on our data model enable process mining over multiple inter-related entities using off-the-shelf technology.
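To give a feel for a "directly-follows" relation over several related entities, the sketch below derives one chain of directly-follows pairs per entity from a shared set of events. It is plain Python over hypothetical tuples, not the paper's labeled property graph model and not a Cypher query; the entity identifiers and the function `directly_follows` are assumptions.

```python
from collections import defaultdict

# Hypothetical events: (timestamp, activity, related entity identifiers).
events = [
    (1, "create order", {"order:o1"}),
    (2, "create invoice", {"order:o1", "invoice:v1"}),
    (3, "ship order", {"order:o1"}),
    (4, "pay invoice", {"invoice:v1"}),
]


def directly_follows(events):
    """Return, per entity, the list of (previous, next) activity pairs."""
    last_seen = {}
    edges = defaultdict(list)
    for _, activity, entities in sorted(events, key=lambda e: e[0]):
        for entity in sorted(entities):
            if entity in last_seen:
                edges[entity].append((last_seen[entity], activity))
            last_seen[entity] = activity
    return dict(edges)


df = directly_follows(events)
for entity, pairs in df.items():
    print(entity, pairs)
```

The "create invoice" event takes part in two chains at once, which is the kind of multi-entity behavior a single sequential log cannot express directly.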
TL;DR: In this paper, a data-driven procedure to improve productivity in make-to-stock manufacturing is proposed. By leveraging recent developments in information systems research, the paper addresses manufactur...
Abstract: This paper proposes a data-driven procedure to improve productivity in make-to-stock manufacturing. By leveraging recent developments in information systems research, the paper addresses manufactur...
TL;DR: In this article, the authors provide a holistic view of opportunities and challenges for process mining in organizations identified in a Delphi study with 40 international experts from academia and industry, propose a set of 30 opportunities and 32 challenges, and report differences in the perceived relevance between academics and practitioners.
Abstract: Process mining is an active research domain and has been applied to understand and improve business processes. While significant research has been conducted on the development and improvement of algorithms, evidence on the application of process mining in organizations has been far more limited. In particular, there is limited understanding of the opportunities and challenges of using process mining in organizations. Such an understanding has the potential to guide research by highlighting barriers for process mining adoption and, thus, can contribute to successful process mining initiatives in practice. In this respect, the paper provides a holistic view of opportunities and challenges for process mining in organizations identified in a Delphi study with 40 international experts from academia and industry. Besides proposing a set of 30 opportunities and 32 challenges, the paper conveys insights into the comparative relevance of individual items, as well as differences in the perceived relevance between academics and practitioners. Therefore, the study contributes to the future development of process mining, both as a research field and regarding its application in organizations.
TL;DR: This paper proposes a novel predictive process approach that couples multi-view learning and deep learning, in order to gain predictive accuracy by accounting for the variety of information possibly recorded in event logs.
Abstract: Predictive business process monitoring is a family of online approaches to predict the unfolding of running traces based on the knowledge learned from historical event logs. In this paper, we address the task of predicting the next trace activity from the completed events in a running trace. This is an important business capability, as counting on accurate predictions of future activities may allow companies to guarantee higher utilization by acting proactively in anticipation. We propose a novel predictive process approach that couples multi-view learning and deep learning, in order to gain predictive accuracy by accounting for the variety of information possibly recorded in event logs. Experiments with various benchmark event logs prove the effectiveness of the proposed approach compared to several recent state-of-the-art methods.
TL;DR: A survey of relevant approaches of event data preprocessing for business process mining tasks is presented in this article, where the authors present a quantitative and qualitative analysis of the most popular techniques for event log preprocessing.
Abstract: Process Mining allows organizations to obtain actual business process models from event logs (discovery), to compare the event log or the resulting process model in the discovery task with the existing reference model of the same process (conformance), and to detect issues in the executed process to improve (enhancement). An essential element in the three tasks of process mining (discovery, conformance, and enhancement) is data cleaning, used to reduce the complexity inherent to real-world event data so that it can be easily interpreted, manipulated, and processed in process mining tasks. Thus, new techniques and algorithms for event data preprocessing have been of interest in the business process research community. In this paper, we conduct a systematic literature review and provide, for the first time, a survey of relevant approaches of event data preprocessing for business process mining tasks. The aim of this work is to construct a categorization of techniques or methods related to event data preprocessing and to identify relevant challenges around these techniques. We present a quantitative and qualitative analysis of the most popular techniques for event log preprocessing. We also study and present findings about how a preprocessing technique can improve a process mining task. We also discuss the emerging future challenges in the domain of data preprocessing, in the context of process mining. The results of this study reveal that the preprocessing techniques in process mining have demonstrated a high impact on the performance of the process mining tasks. The data cleaning requirements are dependent on the characteristics of the event logs (voluminous, high variability in trace size, changes in the duration of the activities). In this scenario, most of the surveyed works use more than a single preprocessing technique to improve the quality of the event log. Trace clustering and trace/event-level filtering turned out to be the most commonly used preprocessing techniques owing to their ease of implementation, and they adequately manage noise and incompleteness in the event logs.
TL;DR: The results of the evaluations indicate that PMSS is useful as a guideline to support Six Sigma-based process improvement activities, and contributes to the broad field of quality management.
Abstract: Process mining offers a set of techniques for gaining data-based insights into business processes from event logs. The literature acknowledges the potential benefits of using process mining techniques in Six Sigma-based process improvement initiatives. However, a guideline that is explicitly dedicated to how process mining can be systematically used in Six Sigma initiatives is lacking. To address this gap, the Process Mining for Six Sigma (PMSS) guideline has been developed to support organizations in systematically using process mining techniques aligned with the DMAIC (Define-Measure-Analyze-Improve-Control) model of Six Sigma. Following a design science research methodology, PMSS and its tool support have been developed iteratively in close collaboration with experts in Six Sigma and process mining, and evaluated by means of focus groups, demonstrations and interviews with industry experts. The results of the evaluations indicate that PMSS is useful as a guideline to support Six Sigma-based process improvement activities. It offers a structured guideline for practitioners by extending the DMAIC-based standard operating procedure. PMSS can help increase the efficiency and effectiveness of Six Sigma-based process improvement efforts. This work extends the body of knowledge in the fields of process mining and Six Sigma, and helps close the gap between them. Hence, it contributes to the broad field of quality management.
TL;DR: This paper provides formal definitions of attack models and introduces an effective group-based privacy preservation technique for process mining that covers the main perspectives of process mining including control-flow, time, case, and organizational perspectives.
Abstract: Process mining techniques help to improve processes using event data. Such data are widely available in information systems. However, they often contain highly sensitive information. For example, healthcare information systems record event data that can be utilized by process mining techniques to improve the treatment process, reduce patients’ waiting times, improve resource productivity, etc. However, the recorded event data include highly sensitive information related to treatment activities. Responsible process mining should provide insights about the underlying processes, yet, at the same time, it should not reveal sensitive information. In this paper, we discuss the challenges regarding directly applying existing well-known group-based privacy preservation techniques, e.g., k-anonymity, l-diversity, etc., to event data. We provide formal definitions of attack models and introduce an effective group-based privacy preservation technique for process mining. Our technique covers the main perspectives of process mining including control-flow, time, case, and organizational perspectives. The proposed technique provides interpretable and adjustable parameters to handle different privacy aspects. We employ real-life event data and evaluate both data utility and result utility to show the effectiveness of the privacy preservation technique. We also compare this approach with other group-based approaches for privacy-preserving event data publishing.
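For orientation, the sketch below applies the textbook k-anonymity idea to an event log in the most direct way, treating the whole activity sequence of a case as the quasi-identifier and flagging sequences shared by fewer than k cases. This is an assumption-laden simplification and not the technique proposed in the paper; `violating_variants` and the toy traces are hypothetical.

```python
from collections import Counter


def violating_variants(traces, k):
    """Return the trace variants that occur in fewer than k cases."""
    counts = Counter(tuple(trace) for trace in traces)
    return {variant: n for variant, n in counts.items() if n < k}


traces = [
    ["register", "triage", "discharge"],
    ["register", "triage", "discharge"],
    ["register", "surgery", "discharge"],
]
rare = violating_variants(traces, k=2)
print(rare)  # {('register', 'surgery', 'discharge'): 1}
```

A variant seen in only one case could single out an individual, which is one reason a direct application of group-based techniques to event data is delicate.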
TL;DR: In this paper, the authors presented a study that analyzed four process mining algorithms that are most commonly used in the self-regulated learning literature: Inductive Miner, Heuristics Miner, Fuzzy Miner, and pMineR.
Abstract: The conceptualisation of self-regulated learning (SRL) as a process that unfolds over time has influenced the way in which researchers approach analysis. This gave rise to the use of process mining in contemporary SRL research to analyse data about temporal and sequential relations of processes that occur in SRL. However, little attention has been paid to the choice and combinations of process mining algorithms to achieve the nuanced needs of SRL research. We present a study that 1) analysed four process mining algorithms that are most commonly used in the SRL literature – Inductive Miner, Heuristics Miner, Fuzzy Miner, and pMineR; and 2) examined how the metrics produced by the four algorithms complement each other. The study looked at micro-level processes that were extracted from trace data collected in an undergraduate course (N=726). The study found that Fuzzy Miner and pMineR offered better insights into SRL than the other two algorithms. The study also found that a combination of metrics produced by several algorithms improved interpretation of temporal and sequential relations between SRL processes. Thus, it is recommended that future studies of SRL combine the use of process mining algorithms and work on new tools and algorithms specifically created for SRL research.
TL;DR: It turns out that the participant playing a central role in the network tends to be overburdened with heavier workloads, leading to more undesirable discrepancies and delays, and extensive investigations based on process mining support data-driven decision making to strategically smooth the construction process and increase collaboration opportunities.
TL;DR: In this article, the authors provide a foundation for future research on privacy-preserving and confidential process mining techniques and identify main threats related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining.
Abstract: Privacy and confidentiality are very important prerequisites for applying process mining in order to comply with regulations and keep company secrets. This paper provides a foundation for future research on privacy-preserving and confidential process mining techniques. Main threats are identified and related to a motivating application scenario in a hospital context as well as to the current body of work on privacy and confidentiality in process mining. A newly developed conceptual model structures the discussion that existing techniques leave room for improvement. This results in a number of important research challenges that should be addressed by future process mining research.
TL;DR: In this article, the authors present a design science research project in which a method for the integration of Process Mining (PM) technology using big data analytics promises valuable support for 6S and its data analysis capabilities.
TL;DR: The CoCoMoT (Computing Conformance Modulo Theories) framework is based on data Petri nets (DPNs) as the underlying reference formalism for computing conformance metrics and data-aware alignments.
Abstract: Conformance checking is a key process mining task for comparing the expected behavior captured in a process model and the actual behavior recorded in a log. While this problem has been extensively studied for pure control-flow processes, conformance checking with multi-perspective processes is still in its infancy. In this paper, we attack this challenging problem by considering processes that combine the data and control-flow dimensions. In particular, we adopt data Petri nets (DPNs) as the underlying reference formalism, and show how solid, well-established automated reasoning techniques can be effectively employed for computing conformance metrics and data-aware alignments. We do so by introducing the CoCoMoT (Computing Conformance Modulo Theories) framework, with a fourfold contribution. First, we show how SAT-based encodings studied in the pure control-flow setting can be lifted to our data-aware case, using SMT as the underlying formal and algorithmic framework. Second, we introduce a novel preprocessing technique based on a notion of property-preserving clustering, to speed up the computation of conformance checking outputs. Third, we provide a proof-of-concept implementation that uses a state-of-the-art SMT solver and report on preliminary experiments. Finally, we discuss how CoCoMoT directly lends itself to a number of further tasks, like multi- and anti-alignments, log analysis by clustering, and model repair.
TL;DR: A novel algorithm is presented which can also detect parallel, sequential and concurrent batching over several connected tasks, i.e., subprocesses, and shows that batch processing at the subprocess level can be reliably detected.
TL;DR: In this paper, the authors employ a suite of state-of-the-art algorithms, from the online analytics processing, data mining, and process mining domains, to present an alternative human-in-the-loop AI method to enable educators to identify, explore, and use appropriate interventions for subpopulations of students with the highest deviation in performance or learning process compared to the rest of the class.
Abstract: Learning analytics dashboards commonly visualize data about students with the aim of helping students and educators understand and make informed decisions about the learning process. To assist with making sense of complex and multidimensional data, many learning analytics systems and dashboards have relied strongly on AI algorithms based on predictive analytics. While predictive models have been successful in many domains, there is an increasing realization of the inadequacies of using predictive models in decision-making tasks that affect individuals without human oversight. In this paper, we employ a suite of state-of-the-art algorithms, from the online analytics processing, data mining, and process mining domains, to present an alternative human-in-the-loop AI method to enable educators to identify, explore, and use appropriate interventions for subpopulations of students with the highest deviation in performance or learning process compared to the rest of the class. We demonstrate an application of our proposed approach in an existing learning analytics dashboard (LAD) and explore the recommended drill-downs in a course with 875 students. The demonstration provides an example of the recommendations from real course data and shows how recommendations can lead the user to interesting insights. Furthermore, we demonstrate how our approach can be employed to develop intelligent LADs.
TL;DR: In this article, a three-phase framework is proposed to leverage hospital tracking data of patient visits while designing healthcare layouts with pod structures, which is validated using a case study for a renovation project of a large heart and vascular clinic in the US.
Abstract: This paper proposes a three-phase framework to leverage hospital tracking data of patient visits while designing healthcare layouts with pod structures. The first phase proposes a process mining algorithm that modifies the Probabilistic Deterministic Finite Automata (PDFA) with Particle Swarm Optimization (PDFA-PSO) algorithm to predict the significant patient workflows from hospital historical data. The second phase employs simulation modeling to solve a right-sizing problem to determine the optimal size of the layout pods and the frequency of flows between the different clinical locations. The final phase uses an Unequal Area Facility Layout Problem (UAFLP) to determine the layout typology. The proposed process mining and simulation model are vital steps to measure the frequency between spaces and pod areas, which are needed to solve the UAFLP for outpatient settings. The proposed framework is validated using a case study for a renovation project of a large heart and vascular clinic in the US. The research shows that process mining is an efficient tool to extract a subset of significant patient pathways among 90 pathway variants and build a more realistic simulation that reflects behavioral and operational aspects. The research shows that the PSO algorithm is efficient in estimating the PDFA parameters and improving the prediction accuracy of the extracted patient pathways. In addition, the research shows that Genetic Algorithm with Placement Strategy is an efficient algorithm for layout automation.
TL;DR: This work defines a taxonomy of uncertain event logs and models, and examines the challenges that uncertainty poses on process discovery and conformance checking, and shows how upper and lower bounds for conformance can be obtained by aligning an uncertain trace onto a regular process model.
TL;DR: In this paper, the authors introduce a notion for the precision and fitness of an object-centric Petri net with respect to an object-centric event log, and provide an algorithm to calculate these quality measures.
Abstract: Traditional process mining considers only one single case notion and discovers and analyzes models based on this. However, a single case notion is often not a realistic assumption in practice. Multiple case notions might interact and influence each other in a process. Object-centric process mining introduces the techniques and concepts to handle multiple case notions. So far, such event logs have been standardized and novel process model discovery techniques were proposed. However, notions for evaluating the quality of a model are missing. These are necessary to enable future research on improving object-centric discovery and providing an objective evaluation of model quality. In this paper, we introduce a notion for the precision and fitness of an object-centric Petri net with respect to an object-centric event log. We give a formal definition and accompany this with an example. Furthermore, we provide an algorithm to calculate these quality measures. We discuss our precision and fitness notion based on an event log with different models. Our precision and fitness notions are an appropriate way to generalize quality measures to the object-centric setting since we are able to consider multiple case notions, their dependencies and their interactions.
TL;DR: A new approach for completion time prediction and performance analysis in manufacturing, considering the individual behavior of process activities, is presented, using process mining techniques to support the development of a probabilistic model in Bayesian Networks and predictive models.