TL;DR: The article proposes an online approach that detects and localizes concept drifts in process mining in an integrated way, supported by batch and stream trace clustering. The approach is competitive with baseline concept drift detection methods and allows concept drifts to be analyzed through different process behavior profiles.
Abstract: Process mining can help organizations by extracting knowledge from event logs. However, process mining techniques often assume business processes are stationary, while actual business processes are constantly subject to change because of the complexity of organizations and their external environment. Thus, addressing process changes over time, known as concept drifts, allows for a better understanding of process behavior and can provide a competitive edge for organizations, especially in an online data stream scenario. Current approaches to handling process concept drift focus primarily on detecting and locating concept drifts, often through an integrated, albeit offline, approach. However, some of these integrated approaches rely on complex data structures related to tree-based process models, usually discovered through algorithms whose results are influenced by specific heuristic rules. Moreover, most of the proposed approaches have not been tested on the public event logs labeled with true concept drifts that are commonly used as benchmarks, making comparative analysis difficult. In this article, we propose an online approach to detect and localize concept drifts in an integrated way using batch and stream trace clustering support. In our approach, cluster models provide input information for both concept drift detection and localization methods. Each cluster abstracts a behavior profile underlying the process and reveals descriptive information about the discovered concept drifts. Experiments with benchmark synthetic event logs with different control-flow changes, as well as with real-world event logs, showed that our approach, when relying on the same clustering model, is competitive with baseline concept drift detection methods. In addition, the experiments showed that our approach is able to correctly locate the detected concept drifts and allows the analysis of such concept drifts through different process behavior profiles.
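To make the idea of detecting a drift in a stream of traces concrete, the following is a minimal sketch of a generic window-comparison detector. It is not the clustering-based method the article proposes: it simply compares the trace-variant distribution of consecutive tumbling windows using total variation distance. The function names, the window size, and the threshold are all assumptions chosen for illustration.

```python
from collections import Counter

def variant_distribution(traces):
    """Relative frequency of each trace variant (activity sequence) in a window."""
    counts = Counter(tuple(t) for t in traces)
    total = sum(counts.values())
    return {variant: c / total for variant, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def detect_drifts(stream, window=4, threshold=0.5):
    """Return start indices of windows whose variant distribution diverges
    from the reference window (hypothetical parameters, tumbling windows)."""
    drifts = []
    reference = stream[:window]
    i = window
    while i + window <= len(stream):
        current = stream[i:i + window]
        distance = total_variation(
            variant_distribution(reference), variant_distribution(current)
        )
        if distance > threshold:
            drifts.append(i)
            reference = current  # adopt the new behavior as the reference
        i += window
    return drifts

# Toy stream: the order of activities b and c swaps after the eighth trace.
stream = [["a", "b", "c"]] * 8 + [["a", "c", "b"]] * 8
drifts = detect_drifts(stream)
```

On this toy stream the detector flags the window that starts at trace index 8. A clustering-based approach such as the one described in the abstract would additionally characterize which behavior profile changed, which this sketch does not attempt.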
Ying Liu, Vinicius Stein Dani, Iris Beerepoot, Xixi Lu
1 Jan 2024
TL;DR: Preprocessing tasks are essential for improving the accuracy and reliability of process mining insights. The identified tasks and their usage in case studies provide a foundation for more structured and transparent preprocessing.
Abstract: Event logs are invaluable for conducting process mining projects, offering insights into process improvement and data-driven decision-making. However, data quality issues affect the correctness and trustworthiness of these insights, making preprocessing tasks a necessity. Despite the recognized importance, the execution of preprocessing tasks remains ad-hoc, lacking support. This paper presents a systematic literature review that establishes a comprehensive repository of preprocessing tasks and their usage in case studies. We identify six high-level and 20 low-level preprocessing tasks in case studies. Log filtering, transformation, and abstraction are commonly used, while log enriching, integration, and reduction are less frequent. These results can be considered a first step in contributing to more structured, transparent event log preprocessing, enhancing process mining reliability.
TL;DR: Relationships between change patterns in dynamic event attributes identify co-occurring change patterns and provide insights into process variations.
Abstract: Process mining utilizes process execution data to discover and analyse business processes. Event logs represent process execution data, providing information about activities executed in a process instance. In addition to generic event attributes like activity and timestamp, events might contain domain-specific attributes, such as a blood sugar measurement in a healthcare environment. Many of these values change quite frequently during a typical process. Hence, we refer to those as dynamic event attributes. Change patterns can be derived from dynamic event attributes, describing if the attribute values change from one activity to another. However, change patterns can only be identified in an isolated manner, neglecting the chance of finding co-occurring change patterns. This paper provides an approach to identify relationships between change patterns. We applied the proposed technique to the MIMIC-IV real-world dataset on hospitalizations in the US and evaluated the results with a medical expert. The approach is implemented in Python using the PM4Py framework.
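As an illustration of what a change pattern on a dynamic event attribute might look like, here is a small plain-Python sketch. It does not use PM4Py and is not the paper's technique: the toy traces, the attribute values, and the simple per-trace co-occurrence count are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy log: each trace is a list of (activity, blood_sugar) events.
traces = [
    [("Admit", 180), ("Insulin", 140), ("Discharge", 140)],
    [("Admit", 200), ("Insulin", 150), ("Discharge", 155)],
]

def change_patterns(trace):
    """Yield (from_activity, to_activity, direction) for consecutive events."""
    for (a1, v1), (a2, v2) in zip(trace, trace[1:]):
        if v2 > v1:
            direction = "increase"
        elif v2 < v1:
            direction = "decrease"
        else:
            direction = "no change"
        yield (a1, a2, direction)

# How often each change pattern occurs across the whole log.
pattern_counts = Counter(p for t in traces for p in change_patterns(t))

# How often two distinct change patterns occur together in the same trace.
co_occurrence = Counter()
for t in traces:
    patterns = sorted(set(change_patterns(t)))
    co_occurrence.update(combinations(patterns, 2))
```

Here `pattern_counts` shows that the value decreases between `Admit` and `Insulin` in both traces, while `co_occurrence` records which patterns appear together in a trace. This is only a rough stand-in for the relationships the paper identifies.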
TL;DR: The proposed BP mapping method uses job description documents and NLP to identify process elements.
Abstract: Business Process mapping (BP mapping) is important for a company to identify its activities. Previous research suggests several approaches for process identification and BP mapping, which are easier to apply if the company has already implemented a computer-based information system. The research presented in this paper aims to provide an alternative method for BP mapping, especially for companies that have not implemented a computer-based information system. The proposed method uses the company's job description documents to identify the elements needed to perform BP mapping: actor, process, document, and flow of documents. Natural Language Processing (NLP), a text mining method, is used to mine the job documents and identify the elements that exist in each job position. To illustrate the applicability of the proposed method, samples of job descriptions from 15 companies are taken. The results show that the proposed method can be applied.
TL;DR: Grouping similar local process models improves process mining accuracy and reduces model explosion and repetition.
Abstract: In recent years, process mining emerged as a proven technology to analyze and improve operational processes. An expanding range of organizations using process mining in their daily operation brings a broader spectrum of processes to be analyzed. Some of these processes are highly unstructured, making it difficult for traditional process discovery approaches to discover a start-to-end model describing the entire process. Therefore, the subdiscipline of Local Process Model (LPM) discovery tries to build a set of LPMs, i.e., smaller models that explain sub-behaviors of the process. However, like other pattern mining approaches, LPM discovery algorithms also face the problems of model explosion and model repetition, i.e., the algorithms may create hundreds if not thousands of models, and subsets of them are close in structure or behavior. This work proposes a three-step pipeline for grouping similar LPMs using various process model similarity measures. We demonstrate the usefulness of grouping through a real-life case study, and analyze the impact of different measures, the gravity of repetition in the discovered LPMs, and how it improves after grouping on multiple real event logs.
TL;DR: This study proposes "Bottleneck Mining", a data-driven approach using process mining to identify manufacturing system bottlenecks via sensitivity analysis on system throughput, improving accuracy over traditional methods that rely on machine metrics.
Abstract: In the era of Industry 4.0, with the availability of production data, there is an increasing demand to detect the throughput bottleneck in manufacturing systems using a data-driven approach. Traditional data-driven approaches identify the bottleneck indirectly by analyzing machine metrics collected from the shop floor, such as machine starvation and blockage. While efficient in most cases, these approaches often fail to identify the bottleneck according to its sensitivity-based definition, potentially resulting in incorrect results. To address this gap, we propose a novel data-driven approach named “Bottleneck Mining”. This approach utilizes event logs as inputs to derive executable simulation models through process mining. Subsequently, it performs sensitivity analysis on the system throughput to accurately locate the bottleneck. The usefulness of the proposed approach has been demonstrated on several production lines.
Abstract: Process mining tools empower process analysts to scrutinize business processes by leveraging algorithmic techniques and event log datasets. To support the analysis of inefficiencies of business processes, different types of visualization techniques have been introduced for process mining. These techniques enhance process models by incorporating performance data, for instance to highlight activity duration by using gradational color palettes, and by mapping statistical parameters as text notes directly into the model. So far, tool vendors have designed a diverse spectrum of visual features for enhancing models, but research has not systematically provided insights into their mutual effectiveness. In this paper, we review the visualizations of six common business process mining tools. To account for the variability in the visual display, we expanded existing taxonomies for evaluating event sequences with marks and channels as well as accessibility dimensions, each important for end-user comprehension. Then, we performed an expert survey to assess the legibility of the visualizations to test the validity of our expanded taxonomy. In this way, we demonstrate the potential for improving process mining visualizations to expand its value in today's process mining tools.
Guillermo Cabrera‐Guerrero, Gustavo Betarte, Juan Diego Campo
12 Aug 2024
TL;DR: This study proposes a process mining-based evaluation methodology for cyber range trainings, analyzing user activities and training processes to assess effectiveness from various perspectives, applied in a real-world training session within a cyber range environment.
Abstract: Cyber ranges are computer systems designed to create realistic cybersecurity scenarios for training purposes. It is essential to have a reliable evaluation process to determine whether users have achieved their objectives. User training involves a sequence of activities that are performed in a specific order to reach a particular goal. This article presents a cyber range implementation and puts forth an evaluation methodology that employs process mining to analyze training processes from different perspectives. The methodology is applied in a training session conducted in the cyber range.
TL;DR: This paper presents an algorithm for discovering hierarchical process models from event logs, using event partitioning to define sub-processes, allowing for concurrency and iterations, and improving readability and understandability of process models in process mining.
Abstract: Process mining is a field of computer science that deals with the discovery and analysis of process models based on automatically generated event logs. Currently, many companies are using this technology to optimize and improve their business processes. However, a discovered process model may be too detailed, sophisticated, and difficult for experts to understand. In this paper, we consider the problem of discovering a hierarchical business process model from a low-level event log, i.e., the problem of the automatic synthesis of more readable and understandable process models based on the data stored in the event logs of information systems. The discovery of better-structured and more readable process models is extensively studied in the framework of process mining research from different perspectives. In this paper, we present an algorithm for discovering hierarchical process models represented as two-level workflow Petri nets. The algorithm is based on predefined event partitioning so that this partitioning defines a sub-process corresponding to a high-level transition at the top level of a two-level net. In contrast to existing solutions, our algorithm does not impose restrictions on the process control flow and allows for concurrency and iterations.
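The role of a predefined event partitioning can be sketched with a few lines of Python. This is a simplified illustration, not the paper's algorithm: it assumes that the events of each sub-process occur in contiguous runs, so it ignores the interleaving that concurrency would produce, and it produces abstracted traces rather than Petri nets. The partition and activity names are hypothetical.

```python
from itertools import groupby

# Hypothetical partition of low-level activities into sub-processes.
PARTITION = {
    "open_form": "Register",
    "fill_form": "Register",
    "submit_form": "Register",
    "check_docs": "Review",
    "approve": "Review",
}

def abstract_trace(trace, partition):
    """Split a low-level trace into a high-level trace and per-sub-process runs.

    Consecutive events mapped to the same sub-process are collapsed into one
    high-level step; each run is kept as a low-level sub-trace.
    """
    high_level = []
    sub_traces = {}
    for name, run in groupby(trace, key=partition.__getitem__):
        high_level.append(name)
        sub_traces.setdefault(name, []).append(list(run))
    return high_level, sub_traces

low_level = ["open_form", "fill_form", "submit_form", "check_docs", "approve"]
high, subs = abstract_trace(low_level, PARTITION)
```

In this sketch the high-level trace would feed discovery of the top-level model, and the sub-traces collected under each name would feed discovery of the corresponding sub-process model. A sub-process that is re-entered later in a trace (an iteration) simply contributes another run to its list.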
Abstract: Process mining and simulation can be a powerful combination to analyze and improve business processes. While process mining is commonly applied to analyze past process executions (as-is process), simulation allows a user to explore process executions, which might occur in the future (to-be process). Yet, the combination of process mining and simulation is not commonly used in practice. We see two main reasons for this, which we attempt to solve: (1) missing tool support for the creation and execution of process simulations based on event logs, (2) missing guidance for the user-based adaptation of simulation scenarios. Hence, we are introducing a simulation extension to the Mehrwerk ProcessMining software mpmX, which on the one hand enables the automatic discovery and execution of process simulation models based on an event log, while at the same time providing suggestions for the creation of alternative simulation scenarios. The simulation scenarios thereby cover a subset of the discovered process variants from the as-is process and enable a user to analyze the change in process performance based on a more standardized process.
TL;DR: This study applies Process Mining to analyze online application efficiency, identifying bottlenecks and patterns in a private Thai university's process, revealing issues in payment approval and highlighting key touchpoints influencing awareness, with generalizable insights for service improvement.
Abstract: This research aims to analyze the process and efficiency of the online application process using Process Mining techniques. The event log data from 793 online applicants at a private university in Thailand were analyzed using the Disco tool and the Fuzzy Miner algorithm. The results revealed clear patterns in the application process, highlighting issues and ambiguities. For example, in the payment approval process, there was a 36.7 % return rate to the payment confirmation stage, with a waiting time of up to 141 days to receive a final admission decision. The analysis also identified key touchpoints that influence awareness. Educational institutions or businesses can apply Process Mining techniques to identify and improve processes and study user behavior from start to finish, providing generalizable insights on developing strategies for enhancing service efficiency and creating opportunities for revenue growth.
TL;DR: PMiner is a deep learning-based process mining framework for anomaly detection and reconstruction of business processes. It utilizes a deep autoencoder technique to detect anomalies and outperforms state-of-the-art methods.
Abstract: We propose a deep learning-based process mining framework known as PMiner for the automatic detection of anomalies in business processes. Since there are thousands of business processes in real-time applications such as e-commerce, in the presence of concurrency, they are prone to exhibit anomalies. Such anomalies, if not detected and rectified, cause severe damage to businesses in the long run. Our Artificial Intelligence (AI) enabled framework PMiner takes business process event logs as input and detects anomalies using a deep autoencoder, a technique that is well known for its ability to discriminate anomalies. We propose an algorithm known as Intelligent Business Process Anomaly Detector (IBPAD) to realize the framework. This algorithm learns from historical data and performs encoding and decoding procedures to detect business process anomalies automatically. Our empirical results using the BPI Challenge dataset, released by the IEEE Task Force on Process Mining, revealed that PMiner outperforms state-of-the-art methods in detecting business process anomalies. This framework helps businesses to identify process anomalies and rectify them in time to leverage business continuity prospects.
TL;DR: This study proposes a non-rule-based process mining technique using deep learning to detect anomalies in event log traces, improving efficiency and reducing false alarms, and enabling auditors to focus on transactional anomalies.
Abstract: Process mining is an efficient method that can analyze the full population of transactions using the event log of business processes. Conventional rule-based process mining techniques can detect anomalies; however, they tend to trigger a large number of false alarms. To improve the efficiency of anomaly detection using process mining, this study adopts a deep learning-based classification approach to detect anomalies in the traces of event logs. This approach contributes to the literature by proposing a non-rule-based process mining technique based on deep learning. Results demonstrate that the proposed non-rule-based process mining method can help auditors focus on transactional anomalies. Keywords: Process mining, deep learning, anomaly detection, fraudulent activities.
TL;DR: This study integrates decision tree modeling and process mining to develop data-driven strategies for cardiovascular medication management, identifying blood pressure, sodium, and cholesterol levels as key predictors of medication response and highlighting the need for tailored treatment approaches.
Abstract: Effective management of cardiovascular medication is crucial for improving patient outcomes and optimizing treatment strategies. As healthcare systems increasingly rely on data-driven approaches, integrating advanced analytical techniques can enhance medication management. Despite advancements in cardiovascular care, challenges remain in determining the most effective medication for patients based on individual health metrics, highlighting the need for robust methodologies to analyze and interpret large datasets that can inform treatment decisions. This study aims to develop data-driven strategies for managing cardiovascular medication through two distinct techniques: decision tree modeling and process mining. The dataset utilized in this study includes various patient health metrics, such as blood pressure, sodium levels, cholesterol levels, age, and gender, collected from a diverse population. These features are essential for understanding the relationships between health metrics and medication outcomes. The study employed decision tree analysis (both normal and radial styles) to classify patient responses to medications based on health metrics, while process mining techniques were used to analyze age group variations in medication consumption and identify trends within the data. The findings reveal that blood pressure is the most significant predictor of medication response, followed by sodium and cholesterol levels. The results highlight distinct medication preferences across different age groups and blood pressure categories, emphasizing the need for tailored treatment strategies. Based on our results, the presence of high blood pressure and high cholesterol in individuals aged up to 20 years is concerning. The analysis also revealed commonalities between the decision tree and process mining results, particularly in identifying critical health metrics that influence medication management. 
Both methodologies underscored the importance of physiological factors over demographic variables in predicting medication efficacy. This study offers valuable insights for healthcare practitioners, enabling them to make informed decisions regarding cardiovascular medication management. By leveraging data-driven strategies, practitioners can enhance patient outcomes, reduce adverse effects, and optimize treatment protocols based on individual health metrics.
William Van Woensel, XiaoYang Wang, Daniel Amyot
22 Aug 2024
Abstract: Appendix for the paper "Using Process Mining with Pre- and Post-Intervention Analysis to Improve Digital Service Delivery: A Governmental Case Study", published at the 1st Workshop on Empirical Research in Process Mining, co-located with the 6th International Conference on Process Mining (ICPM 2024).
TL;DR: This paper integrates Business Process Management (BPM) with process mining, presenting a framework that leverages data-driven analytics to enhance BPM's lifecycle phases, optimize processes, and improve governance through transparent and data-driven decision-making.
Abstract: Business Process Management (BPM) aims to continuously improve organizational processes to achieve strategic objectives such as operational efficiency, compliance, and agility. However, traditional BPM initiatives often rely on subjective input, static process models, and assumptions that may not reflect the reality of how processes unfold in practice. Process mining, a data-driven analytical discipline, addresses these limitations by extracting event logs from enterprise information systems and using them to discover, analyze, and improve real-world business processes. This paper explores how BPM can be effectively integrated with process mining techniques. We present a conceptual framework that outlines the synergy between BPM's lifecycle phases and the capabilities of process mining, examine relevant tools and technologies, discuss emerging challenges and opportunities, and provide case studies that illustrate successful integrations. Our findings highlight that process mining serves as a powerful extension of BPM, enabling transparent, data-driven process optimization and governance.
TL;DR: This research proposes a modified genetic process mining algorithm that incorporates timing information to enhance process model performance and robustness by recovering missing events and improving log completeness in incomplete event logs.
Abstract: In process mining, an event log is a structured collection of recorded events that describes the execution of processes within an organization. The completeness of event logs is crucial for ensuring accurate and reliable process models. Incomplete event logs, which can result from system errors, manual data entry mistakes, or irregular operational patterns, undermine the integrity of these models. Addressing this issue is essential for constructing accurate models. This research aims to enhance process model performance and robustness by transforming incomplete event logs into complete ones using a process discovery algorithm. Genetic process mining, a type of process discovery algorithm, is chosen for its ability to evaluate multiple candidate solutions concurrently, effectively recovering missing events and improving log completeness. However, the original form of the genetic process mining algorithm is not optimized for handling incomplete logs, which can result in incorrect models being discovered. To address this limitation, this research proposes a modified approach that incorporates timing information to better manage incomplete logs. By leveraging timing data, the algorithm can infer missing events, leading to process tracking and reconstruction which is more accurate. Experimental results validate the effectiveness of the modified algorithm, showing higher fitness and precision scores, improved process model comparisons, and a good level of coverage without errors. Additionally, several advanced metrics for conformance checking are presented to further validate the process models and event logs discovered by both algorithms.
Abstract: UiPath Task Mining is an AI-powered feature that captures the actions users perform on the desktop, records them at a granular level, including each mouse click and keystroke, and provides a visualization of the captured data, analyzed with the help of Artificial Intelligence. Task Mining also helps users or business analysts identify bottlenecks and discrepancies in the process and may even contribute to improving the process. This paper explores the usage and features of UiPath Task Mining. It also covers the architecture overview of process mining, integrations, and dashboards, among other features. The different types of mining available in Task Mining are also discussed, as are the usage, capabilities, and features of Task Capture. Finally, the paper discusses the relevance of process mining to streamlining business operations, the use of the available templates tailored to specific use cases, process apps, the different fields in the analysis of the process graph, and integration with Automation Hub.
TL;DR: This research proposes a process-aware IoT model using IoT data and process mining techniques, enabling automatic construction of business processes while preserving data richness, and transforming unstructured IoT data into structured data using model-driven architecture and semantic enrichment.
Abstract: The development of suitable systems for managing linked items and data has been made possible by the Internet of Things (IoT), which has completely changed corporate operations. An inventive method called "business process intelligence" combines process mining with process management to enable quick decision-making and real-time analysis. Nevertheless, there are difficulties in converting unstructured Internet of Things data into structured data, as structuring might lead to the loss of crucial information from the original data. In this research, a unique approach to automatically constructing IoT-connected business processes while preserving the original data richness is proposed. In order to transform enormous volumes of unstructured IoT data into data structures appropriate for process mining, the approach uses a model-driven architecture (MDA). The second stage involves semantic enrichment of process mining-generated event logs, which is achieved by annotating event log items with IoT-specific semantic concepts using ontologies.
Abstract: Business process simulation is a well-known approach to estimate the impact of changes to a process with respect to time and cost measures – a practice known as what-if process analysis. The usefulness of such estimations hinges on the accuracy of the underlying simulation model. Data-Driven Simulation (DDS) methods leverage process mining techniques to learn process simulation models from event logs. Empirical studies have shown that, while DDS models adequately capture the observed sequences of activities and their frequencies, they fail to accurately capture the temporal dynamics of real-life processes. In contrast, generative Deep Learning (DL) models are better able to capture such temporal dynamics. The drawback of DL models is that users cannot alter them for what-if analysis due to their black-box nature. This paper presents a hybrid approach to learn process simulation models from event logs wherein a (stochastic) process model is extracted via DDS techniques, and then combined with a DL model to generate timestamped event sequences. An experimental evaluation shows that the resulting hybrid simulation models match the temporal accuracy of pure DL models, while partially retaining the what-if analysis capability of DDS approaches.