TL;DR: For smaller target data sets, freezing the weights of the initial layers of the network gives better results on the target set classes; a simple, easy-to-implement training heuristic based on these findings is presented.
Abstract: In this paper we study the effect of target set size on transfer learning in deep convolutional neural networks. This is an important problem because labelling is a costly task, and for new or specific classes the number of labelled instances available may simply be too small. We present results for a series of experiments on the Tiny-ImageNet and MiniPlaces2 data sets in which we either train on a target set of classes from scratch, retrain all layers, or successively lock more layers in the network. Our findings indicate that for smaller target data sets, freezing the weights of the initial layers of the network gives better results on the target set classes. We present a simple and easy-to-implement training heuristic based on these findings.
TL;DR: A novel tree structure, the SIQ-Tree (Sum of Item Quantities), is proposed; it captures database information in a single pass and outperforms a state-of-the-art algorithm in runtime and number of generated candidates, with similar memory usage.
Abstract: In frequent pattern mining, items are considered as having the same importance in a database and their occurrences are represented as binary values in transactions. In real-world databases, however, items not only have relative importance but also are represented as non-binary values in transactions. High utility pattern mining is one of the most essential issues in the pattern mining field, which recently emerged to address the limitation of frequent pattern mining. Meanwhile, tree construction with a single database scan is significant since a database scan is a time-consuming task. In utility mining, an additional database scan is necessary to identify actual high utility patterns from candidates. In this paper, we propose a novel tree structure, namely the SIQ-Tree (Sum of Item Quantities), which captures database information in a single pass. Moreover, a restructuring method is suggested with strategies for reducing overestimated utilities. The proposed algorithm can construct the SIQ-Tree with only a single scan and effectively decrease the number of candidate patterns thanks to the reduced overestimated utilities, which improves mining performance. Experimental results show that our algorithm outperforms a state-of-the-art one in terms of runtime and the number of generated candidates with a similar memory usage.
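To make the notions of utility and overestimated utility concrete, here is a minimal sketch. It is not the SIQ-Tree algorithm. It assumes the definitions commonly used in high utility pattern mining: the utility of an item in a transaction is its quantity times a unit profit, and the transaction-weighted utility (TWU) of an itemset is the summed total utility of every transaction that contains it. The data, the profit table and all function names are hypothetical.

```python
from typing import Dict, Iterable, List

Transaction = Dict[str, int]  # item -> purchased quantity (non-binary)

# Hypothetical unit profits and a three-transaction database.
PROFIT: Dict[str, int] = {"a": 5, "b": 2, "c": 1}
DB: List[Transaction] = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 4},
]


def transaction_utility(tx: Transaction, profit: Dict[str, int]) -> int:
    """Total utility of one transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in tx.items())


def itemset_utility(itemset: Iterable[str], db: List[Transaction],
                    profit: Dict[str, int]) -> int:
    """Actual utility: only the itemset's own items are counted."""
    items = list(itemset)
    total = 0
    for tx in db:
        if all(item in tx for item in items):
            total += sum(tx[item] * profit[item] for item in items)
    return total


def twu(itemset: Iterable[str], db: List[Transaction],
        profit: Dict[str, int]) -> int:
    """Transaction-weighted utility: an upper bound on the actual utility."""
    items = list(itemset)
    return sum(transaction_utility(tx, profit)
               for tx in db if all(item in tx for item in items))
```

Because TWU counts whole transactions, it never falls below the actual utility. That gap is the overestimation that candidate-pruning strategies try to shrink.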
TL;DR: The problem of variable selection for linear and nonlinear regression is deeply investigated and the curse of dimensionality issue is addressed and it is found that RF models are more efficient for selecting variables especially when used with an external score of importance.
Abstract: Variable selection is crucial for improving interpretation quality and forecasting accuracy. To this end, it is very interesting to choose an effective dimension reduction technique suitable for processing data according to their specificity and characteristics. In this paper, the problem of variable selection for linear and nonlinear regression is deeply investigated. The curse of dimensionality issue is also addressed. An intensive comparative study is performed between Support Vector Regression (SVR) and Random Forests (RF) for the purpose of variable importance assessment then for variable selection. The main contribution of this work is twofold: to expose some experimental insights about the efficiency of variable ranking and selection based on SVR and on RF, and to provide a benchmark study that helps researchers to choose the appropriate method for their data. Experiments on simulated and real-world datasets have been carried out. Results show that the SVR score ∂Gα is recommended for variable ranking in linear situations whereas the RF score is preferable in nonlinear cases. Moreover, we found that RF models are more efficient for selecting variables especially when used with an external score of importance.
TL;DR: This paper presents solutions to the IDA 2016 Industrial Challenge which consists of using machine learning to predict whether a specific component of the Air Pressure System of a vehicle faces imminent failure and evaluates various state-of-the-art classification algorithms.
Abstract: This paper presents solutions to the IDA 2016 Industrial Challenge which consists of using machine learning in order to predict whether a specific component of the Air Pressure System of a vehicle faces imminent failure. This problem is modelled as a classification problem, since the goal is to determine if an unobserved instance represents a failure or not. We evaluate various state-of-the-art classification algorithms and investigate how to deal with the imbalanced dataset and with the high amount of missing data. Our experiments showed that the best classifier was cost-wise 92.56 % better than a baseline solution where a random classification is performed.
TL;DR: This paper proposes a Pareto-based multi-objective binary bat algorithm (MBBA) for association rule mining, and proposes a new method to discover interesting association rules without favoring or excluding any measure.
Abstract: Association rule mining that must satisfy a variety of measures is regarded as a multi-objective optimization problem rather than a single-objective one. Traditional multi-objective algorithms such as the genetic algorithm converge slowly and have low efficiency. Furthermore, the rule sets generated by traditional multi-objective algorithms are too large to be efficiently analyzed and explored in any further process. The bat algorithm is a new, efficient global optimization algorithm whose convergence is superior to that of binary particle swarm optimization (BPSO) and the genetic algorithm. This paper discusses the application of the multi-objective bat algorithm to association rule mining. We propose a Pareto-based multi-objective binary bat algorithm (MBBA) for association rule mining. This algorithm is independent of minimum support and minimum confidence. To evaluate the association rules mined by MBBA, we propose a new method to discover interesting association rules without favoring or excluding any measure. Compared with the single-objective BPSO, the binary bat algorithm (BBA) and the Apriori algorithm, experimental results on six datasets show that the new algorithm is feasible and highly effective. It can compensate for the shortcomings of single-objective algorithms and traditional association rule mining algorithms.
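The Pareto idea behind treating rule mining as a multi-objective problem can be sketched as follows. This is not MBBA itself, only a non-dominated filter. It assumes that every measure is to be maximized and uses (support, confidence) pairs as the measures; the rules and their scores are made up for illustration.

```python
from typing import List, Sequence, Tuple

Rule = Tuple[str, Tuple[float, ...]]  # (rule label, measure values)

# Hypothetical rules scored by (support, confidence).
RULES: List[Rule] = [
    ("A->B", (0.40, 0.80)),
    ("A->C", (0.30, 0.90)),
    ("B->C", (0.25, 0.70)),
    ("C->A", (0.40, 0.60)),
]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if it is no worse on every measure and better on one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(rules: List[Rule]) -> List[Rule]:
    """Keep the rules that no other rule dominates (input order preserved)."""
    return [
        (label, scores)
        for label, scores in rules
        if not any(dominates(other, scores) for _, other in rules)
    ]
```

No single measure is favored here: a rule survives as long as no other rule beats it on all measures at once, so no minimum support or confidence threshold is needed.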
TL;DR: A new hiding method based on evolutionary multi-objective optimization (EMO) is proposed and the side effects generated by the hiding process are formulated as optimization goals.
Abstract: Today, people benefit from utilizing data mining technologies, such as association rule mining methods, to find valuable knowledge residing in a large amount of data. However, they also face the risk of exposing sensitive or confidential information when data is shared among different organizations. Thus, a question arises: how can we prevent sensitive knowledge from being discovered, while ensuring that ordinary non-sensitive knowledge can be mined to the maximum extent possible? In this paper, we address the problem of privacy preservation in association rule mining from the perspective of multi-objective optimization. A new hiding method based on evolutionary multi-objective optimization (EMO) is proposed and the side effects generated by the hiding process are formulated as optimization goals. EMO is used to find candidate transactions to modify so that side effects are minimized. Comparative experiments with exact methods on real datasets demonstrated that the proposed method can hide sensitive rules with fewer side effects.
TL;DR: Results show that the metafeatures engineering and the biased sampling method are critical for improving the performance of the classifier.
Abstract: We describe a data mining workflow for predictive maintenance of the Air Pressure System in heavy trucks. Our approach is composed of four steps: (i) a filter that excludes a subset of features and examples based on the number of missing values; (ii) a metafeatures engineering procedure used to create a meta-level feature set with the goal of increasing the information on the original data; (iii) a biased sampling method to deal with the class imbalance problem; and (iv) boosted trees to learn the target concept. Results show that the metafeatures engineering and the biased sampling method are critical for improving the performance of the classifier.
TL;DR: This paper introduces The Morality Machine, a system that tracks ethical sentiment in Twitter discussions based on Moral Foundations Theory, a framework of moral values that are assumed to be universal.
Abstract: This paper introduces The Morality Machine, a system that tracks ethical sentiment in Twitter discussions. Empirical approaches to ethics are rare, and to our knowledge this system is the first to take a machine learning approach. It is based on Moral Foundations Theory, a framework of moral values that are assumed to be universal. Carefully handcrafted keyword dictionaries for Moral Foundations Theory exist, but experiments demonstrate that models that do not leverage these have similar or superior performance, thus proving the value of a more pure machine learning approach.
TL;DR: Compared with the algorithm proposed by Reshef et al. in 2011, the proposed algorithm has lower time complexity and needs less computation time, so it is more suitable for a big data environment.
Abstract: The maximal information coefficient (MIC), a measure of dependence for two-variable relationships, can be used to discover the relationships between two variables in big data. This paper proposes a new mathematical programming model for calculating the value of MIC. A corresponding efficient algorithm is designed to solve the model in a big data environment. To illustrate its validity, the proposed algorithm is applied to the analysis of railway accident data. Experimental results show that the proposed algorithm can find important relationships between two variables in big data, and some factors influencing accidents are identified from among many candidates. In addition, compared with the algorithm proposed by Reshef et al. in 2011, the proposed algorithm has lower time complexity and needs less computation time. Hence the proposed algorithm is more suitable for a big data environment.
TL;DR: The experimental results show that the proposed method is capable of detecting and adjusting to concept drifts of different types, and that it outperforms well-known state-of-the-art methods, especially in the case of high-speed concept drifts.
Abstract: Concept drift, a change in the underlying distribution that data points come from, is an inevitable phenomenon in data streams. Due to the increase in the number of data stream applications, such as network intrusion detection, weather forecasting, and detection of unconventional behavior in financial transactions, numerous studies have recently been conducted in the area of concept drift detection. An ideal method for concept drift detection should be able to rapidly and correctly identify changes in the underlying distribution of data points and adapt its model as quickly as possible while memory and processing time are limited. In this paper, we propose a novel explicit method based on ensemble classifiers for detecting concept drift. The method processes samples one by one, and monitors the distribution of the ensemble’s error in order to detect probable drifts. After detection of a drift, a new classifier is trained on the new concept in order to keep the model up-to-date. The proposed method has been evaluated on some artificial and real benchmark data sets. The experimental results show that the proposed method is capable of detecting and adjusting to concept drifts of different types, and that it outperforms well-known state-of-the-art methods, especially in the case of high-speed concept drifts.
TL;DR: Hierarchical clustering is compared in the experiments with classical learning algorithms showing a similar performance when considering the estimation of a joint probability distribution for all the variables, but with a clear advantage: the simplicity and easiness of the interpretation of the model.
Abstract: The use of graphical probabilistic models in the field of education has been considered for this research. First, classical learning algorithms, such as PC or K2, are reviewed. The problem with these general learning procedures comes from the presence of a high number of variables that measure different aspects of the same concept, as can be the case with socio-economic indicators in a population. In this case, all the variables have some degree of dependence among them, without a true causal structure. So, a new procedure is presented which performs a hierarchical clustering of the data while learning a joint probability distribution. It generalizes AutoClass EM clustering, allowing more complex models. Hierarchical clustering is compared in the experiments with classical learning algorithms, showing a similar performance when considering the estimation of a joint probability distribution for all the variables, but with a clear advantage: the simplicity and ease of interpretation of the model. The method is applied to the analysis of two educational datasets: socio-economic data, academic achievement and drop-outs at the Engineering Faculty of Quevedo State Technical University, and student evaluation of teachers from Gazi University in Ankara (Turkey).
TL;DR: A reinforcement learning based approach to tackle the cost-sensitive learning problem where each input feature has a specific cost and relies on representation learning to enable performing prediction on any partially observed sample, whatever the set of its observed features are.
Abstract: We propose a reinforcement learning based approach to tackle the cost-sensitive learning problem where each input feature has a specific cost. The acquisition process is handled through a stochastic policy which allows features to be acquired in an adaptive way. The general architecture of our approach relies on representation learning to enable performing prediction on any partially observed sample, whatever the set of its observed features are. The resulting model is an original mix of representation learning and of reinforcement learning ideas. It is learned with policy gradient techniques to minimize a budgeted inference cost. We demonstrate the effectiveness of our proposed method with several experiments on a variety of datasets for the sparse prediction problem where all features have the same cost, but also for some cost-sensitive settings.
TL;DR: This paper categorises the large set of proposed link prediction features based on their topological scope, and shows that the contribution of particular categories of features can actually be explained by simple structural properties of the network.
Abstract: Data that involves some sort of relationship or interaction can be represented, modelled and analyzed using the notion of a network. To understand the dynamics of networks, the link prediction problem is concerned with predicting the evolution of the topology of a network over time. Previous work in this direction has largely focussed on finding an extensive set of features capable of predicting the formation of a link, often within some domain-specific context. This sometimes results in a “black box” type of approach in which it is unclear how the (often computationally expensive) features contribute to the accuracy of the final predictor. This paper counters these problems by categorising the large set of proposed link prediction features based on their topological scope, and showing that the contribution of particular categories of features can actually be explained by simple structural properties of the network. An approach called the Efficient Feature Set is presented that uses a limited but explainable set of computationally efficient features that within each scope captures the essential network properties. Its performance is experimentally verified using a large number of diverse real-world network datasets. The result is a generic approach suitable for consistently predicting links with high accuracy.
TL;DR: A Statistical Relational Learning approach to Workflow Mining that takes into account both flexibility and uncertainty in real environments and is able to better classify new execution traces, showing higher accuracy and areas under the PR/ROC curves in most cases.
Abstract: The management of business processes can support efficiency improvements in organizations. One of the most interesting problems is the mining and representation of process models in a declarative language. Various recently proposed knowledge-based languages showed advantages over graph-based procedural notations. Moreover, rapid changes of the environment require organizations to check how compliant new process instances are with the deployed models. We present a Statistical Relational Learning approach to Workflow Mining that takes into account both flexibility and uncertainty in real environments. It performs automatic discovery of process models expressed in a probabilistic logic. It uses the existing DPML algorithm for extracting first-order logic constraints from process logs. The constraints are then translated into Markov Logic to learn their weights. Inference on the resulting Markov Logic model allows a probabilistic classification of test traces, by assigning them the probability of being compliant with the model. We applied this approach to three datasets and compared it with DPML alone, five Petri net- and EPC-based process mining algorithms, and Tilde. The technique is able to better classify new execution traces, showing higher accuracy and areas under the PR/ROC curves in most cases.
TL;DR: The proposed solution performs better in terms of the given challenge metric compared to the traditional classification methods such as SVM, AdaBoost or Random Forests.
Abstract: In this paper, we describe our solution for the machine learning prediction challenge in IDA 2016. For the given problem of 2-class classification on an imbalanced dataset with missing data, we first develop an imputation method based on k-NN to estimate the missing values. Then we define a tailored representation for the given problem as an optimization scheme, which consists of learned distance and voting weights for k-NN classification. The proposed solution performs better in terms of the given challenge metric compared to the traditional classification methods such as SVM, AdaBoost or Random Forests.
TL;DR: This paper approaches the myopia of the State-of-the-Art method RReliefF on mining relevant inter-relationships of the feature space relevant for reducing the entropy around the target variable on regression tasks.
Abstract: Long-term travel time predictions are crucial for tactical and operational public transport planning in schedule design and resource allocation tasks. As with any regression task, their success depends considerably on an adequate feature selection framework. In this paper, we approach the myopia of the State-of-the-Art method RReliefF on mining relevant inter-relationships of the feature space relevant for reducing the entropy around the target variable on regression tasks. A comparative study was conducted using baseline regression methods and LASSO as a valid alternative to RReliefF. Experimental results obtained on a real-world case study uncovered the bias/variance reduction obtained by each approach, pointing out promising ideas on this research line.
TL;DR: The results indicate that sequential pattern mining can significantly improve pattern-based representations, even in a completely unsupervised setting.
Abstract: This paper deals with the extraction of semantic relations from scientific texts. Pattern-based representations are compared to word embeddings in unsupervised clustering experiments, according to their potential to discover new types of semantic relations and recognize their instances. The results indicate that sequential pattern mining can significantly improve pattern-based representations, even in a completely unsupervised setting.
TL;DR: A new optimization method (NHPSO) based on a neighbor heuristic and Gaussian cloud learning is introduced in order to improve the performance of traditional PSO; it is superior to recent variants of PSO in terms of convergence speed, solution accuracy, algorithm efficiency and robustness.
Abstract: Particle Swarm Optimization (PSO) is a heuristic optimization technique based on swarm intelligence that can be applied to solving many real-world optimization problems. However, the standard PSO algorithm can easily get trapped in local optima and has slow convergence speed, and these drawbacks have hindered its further development in all fields. In this paper, a new optimization method (NHPSO) based on a neighbor heuristic and Gaussian cloud learning is introduced in order to improve the performance of traditional PSO. The NHPSO consists of two main steps. First, by analyzing the relationship among particles in the evolutionary process, a neighbor heuristic mechanism is performed to improve the search efficiency and convergence speed. In addition, a Gaussian cloud learning strategy is introduced to enhance population diversity and balance the global and local search abilities. The performance of the NHPSO is tested using 12 benchmark functions and 6 shifted functions. Results show that NHPSO is superior to the recent variants of PSO in terms of convergence speed, solution accuracy, algorithm efficiency and robustness.
TL;DR: A new methodology based on Principal Component Analysis and Logistic Regression is proposed that enables the selection of particular genes that are relevant for classification in DNA microarrays.
Abstract: DNA microarray technology can be used to diagnose cancer and other diseases. To automate the analysis of such data, pattern recognition and machine learning algorithms can be applied. However, the curse of dimensionality is unavoidable: very few samples to train, and many attributes in each sample. As the predictive accuracy of supervised classifiers decays with irrelevant and redundant features, a dimensionality reduction process is essential. The main idea is to retain only the genes that are the most influential in the classification of the disease. In this paper, a new methodology based on Principal Component Analysis and Logistic Regression is proposed. Our method enables the selection of particular genes that are relevant for classification. Experiments were run using eight different classifiers on two benchmark datasets: Leukemia and Lymphoma. The results show that our method not only reduces the number of required attributes, but also increases the classification accuracy by more than 10% in all the cases we tested.
TL;DR: This paper goes back to the binary discernibility matrix and reformulate some basic concepts that allow for an algorithm for computing reducts and defines a more compact and ordered matrix, which allows reducing even more the amount of candidates to be evaluated and the cost of verifying each one of them.
Abstract: Attribute reduction is a very important task in Rough Set Theory. In this context, in recent years several attribute reduction algorithms based on the discernibility matrix have been proposed. In this paper, we go back to the binary discernibility matrix and reformulate some basic concepts that allow us to build on them an algorithm for computing reducts. The proposed algorithm takes advantage of the binary representation used for the discernibility matrix and depending on the fulfillment of certain pruning properties several candidates can be jumped (pruned). In this way, the number of candidates to be tested is reduced. Additionally, from the binary discernibility matrix, we define a more compact and ordered matrix, which allows reducing even more the amount of candidates to be evaluated, and the cost of verifying each one of them. The proposed algorithm is able to compute all the reducts of an information table, but it can also be used to find a reduct, or a certain number of them. Experiments over standard datasets show that our algorithm has a good performance.
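As an illustration of the binary discernibility matrix, the sketch below builds it from a small decision table and enumerates reducts by brute force. This is not the pruning algorithm the abstract describes. It assumes the usual construction: one row per pair of objects with different decisions, with a 1 in each attribute column where the two objects differ. The table and function names are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Row = Tuple[int, ...]

# Hypothetical decision table: 4 objects, 3 condition attributes.
OBJECTS: List[Tuple[int, ...]] = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (0, 0, 1)]
DECISIONS: List[int] = [0, 1, 1, 0]


def binary_discernibility_matrix(objects: Sequence[Sequence[int]],
                                 decisions: Sequence[int]) -> List[Row]:
    """One row per object pair with different decisions; 1 = attribute differs."""
    rows: List[Row] = []
    for i, j in combinations(range(len(objects)), 2):
        if decisions[i] != decisions[j]:
            row = tuple(int(a != b) for a, b in zip(objects[i], objects[j]))
            if any(row):  # all-zero rows (inconsistent pairs) are dropped
                rows.append(row)
    return rows


def is_super_reduct(attrs: Sequence[int], matrix: List[Row]) -> bool:
    """True if every row has a 1 in at least one selected attribute column."""
    return all(any(row[a] for a in attrs) for row in matrix)


def all_reducts(matrix: List[Row], n_attrs: int) -> List[Tuple[int, ...]]:
    """Brute force by increasing size; supersets of found reducts are skipped."""
    reducts: List[Tuple[int, ...]] = []
    for size in range(1, n_attrs + 1):
        for cand in combinations(range(n_attrs), size):
            if any(set(r) <= set(cand) for r in reducts):
                continue  # not minimal: contains a smaller reduct
            if is_super_reduct(cand, matrix):
                reducts.append(cand)
    return reducts
```

The brute-force search tests up to 2^n candidates, which is exactly the cost that pruning properties and a more compact matrix are meant to reduce.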
TL;DR: DSCo-NG is proposed, which reduces DSCo’s complexity and offers an efficient (linear time complexity and low memory footprint), accurate and generic approach for time series classification.
Abstract: The abundance of time series data in various domains and their high dimensionality characteristic are challenging for harvesting useful information from them. To tackle storage and processing challenges, compression-based techniques have been proposed. Our previous work, Domain Series Corpus (DSCo), compresses time series into symbolic strings and takes advantage of language modeling techniques to extract from the training set knowledge about different classes. However, this approach was flawed in practice due to its excessive memory usage and the need for a priori knowledge about the dataset. In this paper we propose DSCo-NG, which reduces DSCo’s complexity and offers an efficient (linear time complexity and low memory footprint), accurate (performance comparable to approaches working on uncompressed data) and generic (so that it can be applied to various domains) approach for time series classification. Our confidence is backed with extensive experimental evaluation against publicly accessible datasets, which also offers insights on when DSCo-NG can be a better choice than others.
TL;DR: It is shown experimentally that the decision trees resulting from the proposed FMMDT learning algorithm achieve the highest accuracy and the lowest size and depth in comparison with C4.5, BFTree, SimpleCart and NBTree on the most commonly used UCI data sets.
Abstract: This paper presents a new decision tree learning algorithm, fuzzy min-max decision tree (FMMDT), based on fuzzy min-max neural networks. In contrast with traditional decision trees in which a single attribute is selected as the splitting test, the internal nodes of the proposed algorithm contain a fuzzy min-max neural network. In the proposed learning algorithm, the flexibility inherent in fuzzy logic and the computational efficiency of min-max neural networks are combined in the decision tree learning framework. FMMDT splits the feature space non-linearly based on multiple attributes, which provides not only conceptually more insightful splits but also decision trees with smaller size and depth. The decision trees resulting from the FMMDT learning algorithm have a non-traditional architecture, which enables determining the class label of the instances as early as possible. Moreover, FMMDT creates decision trees which are interpretable by the domain expert. It is shown experimentally that the decision trees resulting from the proposed FMMDT learning algorithm achieve the highest accuracy and the lowest size and depth in comparison with C4.5, BFTree, SimpleCart and NBTree on the most commonly used UCI data sets. Moreover, the experiments reveal that FMMDT creates decision trees with stable structure.
TL;DR: An iterative self-training Support Vector Machine (SVM) algorithm combined with feature re-extraction is proposed for semi-supervised learning, which only needs a small set of labeled samples to train the classifier and is thus very useful in Brain-Computer Interface (BCI) design.
Abstract: In this paper, an iterative self-training Support Vector Machine (SVM) algorithm combined with feature re-extraction is proposed for semi-supervised learning, which only needs a small set of labeled samples to train the classifier and is thus very useful in Brain-Computer Interface (BCI) design. Two methods, model selection based self-training and a confidence criterion, are also proposed for searching for the best parameter pair of the SVM and for selecting the most useful unlabeled data to expand the labeled training data set, respectively. Dataset IVa of BCI Competition III is used to demonstrate the validity of our algorithm with a statistical significance test. Experimental results of the proposed iterative algorithm show the validity of re-extracting features and the robustness of the features to noise. In addition, the convergence of the proposed algorithm and the validity of the method measuring the consistency of the features are also demonstrated in experiments.
TL;DR: A new approach to mine dependencies between streams of interval-based events that links two events if they occur in a similar manner, one being often followed by the other one after a certain time interval in the data, which is robust to temporal variability of events.
Abstract: Pattern mining over data streams is critical to a variety of applications such as understanding and predicting weather phenomena or outdoor surveillance. Most of the current techniques attempt to discover relationships between time-point events but are not practical for discovering dependencies over interval-based events. In this work, we present a new approach to mine dependencies between streams of interval-based events that links two events if they occur in a similar manner, one being often followed by the other one after a certain time interval in the data. The proposed method is robust to temporal variability of events and determines the most appropriate time intervals whose validity is assessed by a chi-squared test. As several intervals may redundantly describe the same dependency, the approach retrieves only the most specific intervals with respect to a dominance relationship over temporal dependencies, and thus avoids the classical problem of pattern flooding in data mining. The TEDDY algorithm, TEmporal Dependency DiscoverY, prunes the search space while guaranteeing the discovery of all valid and significant temporal dependencies. We present empirical results on simulated and real-life data to show the scalability and the robustness of our approach. The dependency relationships define a graph that supports intelligent analysis as illustrated by two case studies: outdoor surveillance of a building via video camera and motion sensors, and assistance for road deicing operations based on the humidity and temperature measurements at the urban scale. These applications demonstrate the efficiency and the effectiveness of our approach.
TL;DR: Experimental results indicate the superiority of the proposed FKH optimization algorithm in comparison with the standard KH optimization algorithm.
Abstract: The Krill Herd (KH) optimization algorithm was recently proposed for solving optimization problems, based on the herding behavior of krill individuals in nature. In this paper, we develop the Standard Krill Herd (SKH) algorithm and propose the Fuzzy Krill Herd (FKH) optimization algorithm, which is able to dynamically adjust the amounts of exploration and exploitation by looking at the progress of solving the problem in each step. In order to evaluate the proposed FKH algorithm, we use some standard benchmark functions and also the Inventory Control Problem. Experimental results indicate the superiority of our proposed FKH optimization algorithm in comparison with the standard KH optimization algorithm.
TL;DR: One of the local matching algorithms is improved, a complete real-time system for calculating depth image with different SAD (Sum of Absolute Differences) matching window in different texture regions is built and the quality of the depth map calculated from this method is somewhat improved.
Abstract: Depth measurement based on stereo vision is a very popular way of obtaining depth information. However, stereo matching is computationally expensive, time-consuming and prone to large matching errors, so it is often ineffective when used in real-time systems. This paper improves one of the local matching algorithms and builds a complete real-time system that calculates the depth image using different SAD (Sum of Absolute Differences) matching windows in different texture regions. We capture images from a Bumblebee2 stereoscopic camera mounted on a small unmanned car, use Matlab to calibrate the camera and compute the correction of the raw images, and perform stereo matching in VS2010 to obtain the depth map. The method is simple, runs in real time and is adaptable, and the quality of the resulting depth map is somewhat improved.
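The core of SAD-based local matching can be sketched in a few lines. This is a generic illustration, not the system described above. It assumes rectified grayscale images stored as lists of rows, with a pixel at column c in the left image appearing at column c - d in the right image for disparity d. The window half-width `half` is the parameter that a texture-adaptive scheme would vary per region; that selection logic is not shown. All names and the toy images are hypothetical.

```python
from typing import List

Image = List[List[int]]

# Toy rectified pair: the right image is the left one shifted by 2 columns.
_BASE = [0, 10, 50, 20, 90, 30, 70, 40]
LEFT: Image = [[v + r for v in _BASE] for r in range(3)]
RIGHT: Image = [
    [row[c + 2] if c + 2 < len(row) else 0 for c in range(len(row))]
    for row in LEFT
]


def sad(left: Image, right: Image, row: int, col: int, d: int, half: int) -> int:
    """Sum of absolute differences over a (2*half+1)^2 window at disparity d."""
    total = 0
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            total += abs(left[r][c] - right[r][c - d])
    return total


def best_disparity(left: Image, right: Image, row: int, col: int,
                   max_d: int, half: int = 1) -> int:
    """Disparity in [0, max_d] with the lowest SAD cost (smallest d wins ties).

    The caller is assumed to pass an interior pixel so that the window
    stays inside the left image.
    """
    best_d, best_cost = 0, None
    for d in range(max_d + 1):
        if col - half - d < 0:  # shifted window would leave the right image
            break
        cost = sad(left, right, row, col, d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

A larger window averages out noise in weakly textured regions at higher cost, while a smaller one preserves detail where texture is rich, which is the usual motivation for varying the window size.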
TL;DR: This work takes an information-theoretic approach to analyzing data from the WAIS Divide ice core, the longest continuous and highest-resolution water isotope record yet recovered from Antarctica, using weighted permutation entropy to calculate the Shannon entropy rate from these isotope measurements.
Abstract: Paleoclimate records are extremely rich sources of information about the past history of the Earth system. We take an information-theoretic approach to analyzing data from the WAIS Divide ice core, the longest continuous and highest-resolution water isotope record yet recovered from Antarctica. We use weighted permutation entropy to calculate the Shannon entropy rate from these isotope measurements, which are proxies for a number of different climate variables, including the temperature at the time of deposition of the corresponding layer of the core. We find that the rate of information production in these measurements reveals issues with analysis instruments, even when those issues leave no visible traces in the raw data. These entropy calculations also allow us to identify a number of intervals in the data that may be of direct relevance to paleoclimate interpretation, and to form new conjectures about what is happening in those intervals—including periods of abrupt climate change.
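For readers unfamiliar with weighted permutation entropy, here is a minimal sketch. It assumes the commonly used variance-weighted definition: each length-`order` window is mapped to its ordinal pattern, each occurrence is weighted by the variance of the window, and the Shannon entropy of the weighted pattern distribution is computed. The normalization by log2(order!) and the parameter names are choices made for this illustration, not details taken from the paper.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple


def weighted_permutation_entropy(series: Sequence[float], order: int = 3,
                                 delay: int = 1, normalize: bool = True) -> float:
    """Variance-weighted permutation entropy in bits (or in [0, 1] if normalized)."""
    n_windows = len(series) - (order - 1) * delay
    if order < 2 or n_windows <= 0:
        raise ValueError("series too short for the requested order and delay")

    weight_per_pattern: Dict[Tuple[int, ...], float] = defaultdict(float)
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        mean = sum(window) / order
        variance = sum((x - mean) ** 2 for x in window) / order
        # Ordinal pattern: indices sorted by value (ties broken by position).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        weight_per_pattern[pattern] += variance

    total = sum(weight_per_pattern.values())
    if total == 0:  # constant series: no amplitude information at all
        return 0.0

    entropy = sum(
        -(w / total) * math.log2(w / total)
        for w in weight_per_pattern.values() if w > 0
    )
    if normalize:
        entropy /= math.log2(math.factorial(order))
    return entropy
```

A strictly monotone series produces a single ordinal pattern and therefore zero entropy, while a series that visits patterns with equal total weight reaches the maximum. Computing this quantity in a sliding window along a record is one way to obtain a time-resolved estimate.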