TL;DR: The colon CAD system has been validated on the largest set of data to date, and demonstrates excellent performance, in terms of its high sensitivity, low false positive rate, and computational efficiency.
Abstract: We present a complete, end-to-end computer-aided detection (CAD) system for identifying lesions in the colon, imaged with computed tomography (CT). This system includes facilities for colon segmentation, candidate generation, feature analysis, and classification. The algorithms have been designed to offer robust performance to variation in image data and patient preparation. By utilizing efficient 2D and 3D processing, software optimizations, multi-threading, feature selection, and an optimized cascade classifier, the CAD system quickly determines a set of detection marks. The colon CAD system has been validated on the largest set of data to date, and demonstrates excellent performance, in terms of its high sensitivity, low false positive rate, and computational efficiency.
TL;DR: A new environment integrating different types of pedagogical approaches, resources, tools and technologies for programming learning support is presented; its main goal is to support students with more difficulties, and it will also provide a set of resources supporting the learning of more advanced topics.
Abstract: In recent years, many tools have been proposed to reduce the programming learning difficulties felt by many students. Our group has contributed to this effort through the development of several tools, such as VIP, SICAS, OOP-Anim, SICAS-COL and H-SICAS. Even though we had some positive results, the utilization of these tools doesn’t seem to significantly reduce weaker students’ difficulties. These students need stronger support to motivate them to get engaged in learning activities, inside and outside the classroom. Nowadays, many technologies are available to create contexts that may help to accomplish this goal. We consider that a promising path goes through the integration of solutions. In this paper we analyze the features, strengths and weaknesses of the tools developed by our group. Based on these considerations we present a new environment, integrating different types of pedagogical approaches, resources, tools and technologies for programming learning support. With this environment, currently under development, it will be possible to review contents and lessons, based on video and screen captures. The support for collaborative tasks is another key point to improve and stimulate different models of teamwork. The platform will also allow the creation of various alternative models (learning objects) for the same subject, enabling personalized learning paths adapted to each student’s knowledge level, needs and preferred learning styles. The learning sequences will work as a study organizer, following a suitable taxonomy, according to students’ cognitive skills. Although the main goal of this environment is to support students with more difficulties, it will provide a set of resources supporting the learning of more advanced topics. Software engineering techniques and representations, object orientation and event programming are features that will be available in order to promote the learning progress of students.
TL;DR: In this paper, Hutter extended Ockham's razor principle to more practical (partial, approximate, probabilistic, parametric) world models (rather than theories of everything) and criticised the anthropic principle, the doomsday argument, the no free lunch theorem, and the falsifiability dogma.
Abstract: Marcus Hutter, SoCS, RSISE, IAS, CECS, Australian National University, Canberra, ACT, 0200, Australia; E-Mail: marcus.hutter@anu.edu.au. Received: 30 August 2010 / Accepted: 22 September 2010 / Published: 29 September 2010. Increasingly encompassing models have been suggested for our world. Theories range from generally accepted to increasingly speculative to apparently bogus. The progression of theories from ego- to geo- to helio-centric models to universe and multiverse theories and beyond was accompanied by a dramatic increase in the sizes of the postulated worlds, with humans being expelled from their center to ever more remote and random locations. Rather than leading to a true theory of everything, this trend faces a turning point after which the predictive power of such theories decreases (actually to zero). Incorporating the location and other capacities of the observer into such theories avoids this problem and allows to distinguish meaningful from predictively meaningless theories. This also leads to a truly complete theory of everything consisting of a (conventional objective) theory of everything plus a (novel subjective) observer process. The observer localization is neither based on the controversial anthropic principle, nor has it anything to do with the quantum-mechanical observation process. The suggested principle is extended to more practical (partial, approximate, probabilistic, parametric) world models (rather than theories of everything). Finally, I provide a justification of Ockham’s razor, and criticize the anthropic principle, the doomsday argument, the no free lunch theorem, and the falsifiability dogma. Keywords: world models; observer localization; predictive power; Ockham’s razor; universal theories; inductive reasoning; simplicity and complexity; universal self-sampling; no-free-lunch; computability
TL;DR: A web-based programming task database is used as an easy and risk-free environment for taking the first steps in programming Java and the Animal algorithm visualization system is used to visualize the dynamic behavior of algorithms and data structures.
Abstract: Both learning how to program and understanding algorithms or data structures are often difficult. This paper presents three complementary approaches that we employ to help our students in learning to program, especially during the first term of their study. We use a web-based programming task database as an easy and risk-free environment for taking the first steps in programming Java. The Animal algorithm visualization system is used to visualize the dynamic behavior of algorithms and data structures. We complement both approaches with tutorial videos on using the Eclipse IDE. We also report on the experiences with this combined approach.
TL;DR: This article shows how to provide full support to the analysis of recursive algorithms in the SRec system, enriched with interaction techniques inspired by the information visualization (InfoVis) field.
Abstract: Algorithm animations typically assist in educational tasks aimed simply at achieving understanding. Potentially, animations could assist in higher levels of cognition, such as the analysis level, but they usually fail in providing this support because they are not flexible or comprehensive enough. In particular, animations of recursion provided by educational systems hardly support the analysis of recursive algorithms. Here we show how to provide full support to the analysis of recursive algorithms. From a technical point of view, animations are enriched with interaction techniques inspired by the information visualization (InfoVis) field. Interaction tasks are presented in seven categories, and deal with both static visualizations and dynamic animations. All of these features are implemented in the SRec system, and visualizations generated by SRec are used to illustrate the article.
TL;DR: To calculate second-derivative-based 5-point-window L1 splines, an analysis-based, parallelizable algorithm is introduced that is orders of magnitude faster than the previously widely used primal affine algorithm.
Abstract: We compare univariate L1 interpolating splines calculated on 5-point windows, on 7-point windows and on global data sets using four different spline functionals, namely, ones based on the second derivative, the first derivative, the function value and the antiderivative. Computational results indicate that second-derivative-based 5-point-window L1 splines preserve shape as well as or better than the other types of L1 splines. To calculate second-derivative-based 5-point-window L1 splines, we introduce an analysis-based, parallelizable algorithm. This algorithm is orders of magnitude faster than the previously widely used primal affine algorithm.
TL;DR: The experimental results obtained from iteratively applying WP-SVM to improve detection sensitivity demonstrate its viability for incremental learning, thereby motivating further follow-on research to address a wider range of true positive subclasses such as pedunculated, sessile, and flat polyps.
Abstract: We present in this paper a novel dynamic learning method for classifying polyp candidate detections in Computed Tomographic Colonography (CTC) using an adaptation of the Least Square Support Vector Machine (LS-SVM). The proposed technique, called Weighted Proximal Support Vector Machines (WP-SVM), extends the offline capabilities of the SVM scheme to address practical CTC applications. Incremental data are incorporated in the WP-SVM as a weighted vector space, and the only storage requirements are the hyperplane parameters. WP-SVM performance evaluation based on 169 clinical CTC cases using a 3D computer-aided diagnosis (CAD) scheme for feature reduction compares favorably with previously published CTC CAD studies, which have, however, involved only binary and offline classification schemes. The experimental results obtained from iteratively applying WP-SVM to improve detection sensitivity demonstrate its viability for incremental learning, thereby motivating further follow-on research to address a wider range of true positive subclasses such as pedunculated, sessile, and flat polyps, and a wider range of false positive subclasses such as folds, stool, and tagged materials.
TL;DR: This work analytically investigates univariate C1 continuous cubic L1 interpolating splines calculated by minimizing an L1 spline functional based on the second derivative on 5-point windows, and links geometric properties of the data points in the windows with linearity, convexity and oscillation properties of the resulting L1 spline.
Abstract: We analytically investigate univariate C1 continuous cubic L1 interpolating splines calculated by minimizing an L1 spline functional based on the second derivative on 5-point windows. Specifically, we link geometric properties of the data points in the windows with linearity, convexity and oscillation properties of the resulting L1 spline. These analytical results provide the basis for a computationally efficient algorithm for calculation of L1 splines on 5-point windows.
TL;DR: A new strategy for Magnus is presented that succeeds in visiting the maximal number of positions in 3(n – 1) rounds, which is the optimal number of rounds up to a constant factor.
Abstract: We analyze further the Magnus-Derek game, a two-player game played on a round table with n positions. The players jointly control the movement of a token. One player, Magnus, aims to maximize the number of positions visited while minimizing the number of rounds. The other player, Derek, attempts to minimize the number of visited positions. We present a new strategy for Magnus that succeeds in visiting the maximal number of positions in 3(n – 1) rounds, which is the optimal number of rounds up to a constant factor.
TL;DR: A novel recognition method of pulmonary nodules in thoracic computed tomography scans is described by use of three-dimensional spherical and cylindrical models that represent nodules and blood vessels, respectively, based on the Bayes theorem.
Abstract: The present paper describes a novel method for recognizing pulmonary nodules (i.e., cancer candidates) in thoracic computed tomography scans by use of three-dimensional spherical and cylindrical models that represent nodules and blood vessels, respectively. The anatomical validity of these object models and their fidelity to computed tomography scans are evaluated based on the Bayes theorem. Nodule recognition is performed by maximum a posteriori estimation. The proposed method is applied to 26 actual computed tomography scans, and experimental results are shown.
TL;DR: Storjohann’s modular arithmetic approach is combined with the segment-LLL approach to further improve the worst case complexity of the segment-LLL algorithms by a factor of n^0.5.
Abstract: Author to whom correspondence should be addressed; E-Mail: mehrotra@iems.northwestern.edu; Tel.: +1-847-491-3155; Fax: +1-847-491-8005. Received: 28 May 2010 / Accepted: 29 June 2010 / Published: 12 July 2010. The algorithm of Lenstra, Lenstra, and Lovász (LLL) transforms a given integer lattice basis into a reduced basis. Storjohann improved the worst case complexity of LLL algorithms by a factor of O(n) using modular arithmetic. Koy and Schnorr developed a segment-LLL basis reduction algorithm that generates lattice basis satisfying a weaker condition than the LLL reduced basis with O(n) improvement than the LLL algorithm. In this paper we combine Storjohann’s modular arithmetic approach with the segment-LLL approach to further improve the worst case complexity of the segment-LLL algorithms by a factor of n
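For orientation, the basic reduction that the modular and segment variants speed up can be sketched as follows. This is a minimal textbook LLL in exact rational arithmetic, not Storjohann's or Koy and Schnorr's algorithm; the function names, the parameter delta = 3/4, and the example basis are illustrative assumptions.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors and coefficients mu[i][j] (exact)."""
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [vi - mu[i][j] * oj for vi, oj in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction of a linearly independent integer basis."""
    basis = [list(b) for b in basis]
    n = len(basis)
    ortho, mu = gram_schmidt(basis)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            q = round(mu[k][j])
            if q:
                basis[k] = [a - q * b for a, b in zip(basis[k], basis[j])]
                ortho, mu = gram_schmidt(basis)  # recomputed for clarity, not speed
        # Lovasz condition between consecutive Gram-Schmidt vectors.
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            ortho, mu = gram_schmidt(basis)
            k = max(k - 1, 1)
    return basis


reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
```

Recomputing the full Gram-Schmidt data after every update keeps the sketch short but is far from the complexity bounds discussed in the abstract; it is meant only to show the size-reduction and swap steps.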
TL;DR: A car traffic simulation prototype for complex networks formed by a collection of roads and junctions is presented, with traffic load evolution described by a model based on fluid dynamic conservation laws, deduced from conservation of the number of cars.
Abstract: We present a car traffic simulation prototype for complex networks that are formed by a collection of roads and junctions. Traffic load evolution is described by a model based on fluid dynamic conservation laws, deduced from conservation of the number of cars. The model contains some additional hypotheses in order to reproduce specific car traffic features such as route based car distribution at nodes and the presence of right-of-way at the crossroads. A complete implementation of this model is then presented, together with computational results on case studies.
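To illustrate what conservation of the number of cars means numerically, here is a minimal sketch for a single ring road. The Greenshields flux, the Lax-Friedrichs-type interface flux, the periodic boundary, and all parameter values are assumptions made for illustration; the abstract does not specify the flux or the scheme, and junctions are not modeled here.

```python
def greenshields_flux(rho, v_max=1.0, rho_max=1.0):
    """Assumed flux f(rho) = v_max * rho * (1 - rho / rho_max)."""
    return v_max * rho * (1.0 - rho / rho_max)


def step(rho, dt, dx, v_max=1.0, rho_max=1.0):
    """One conservative finite-volume update on a ring road (periodic)."""
    alpha = v_max  # bound on |f'(rho)| for 0 <= rho <= rho_max

    def interface_flux(left, right):
        f_l = greenshields_flux(left, v_max, rho_max)
        f_r = greenshields_flux(right, v_max, rho_max)
        return 0.5 * (f_l + f_r) - 0.5 * alpha * (right - left)

    n = len(rho)
    return [
        rho[i] - dt / dx * (interface_flux(rho[i], rho[(i + 1) % n])
                            - interface_flux(rho[i - 1], rho[i]))
        for i in range(n)
    ]


dx, dt = 0.1, 0.05  # satisfies the CFL condition alpha * dt / dx <= 1
rho0 = [0.8 if 10 <= i < 20 else 0.2 for i in range(50)]  # a queue of cars
rho = rho0
for _ in range(100):
    rho = step(rho, dt, dx)

cars_before = sum(rho0) * dx
cars_after = sum(rho) * dx  # equal up to rounding: fluxes cancel pairwise
```

Because each cell loses through its right interface exactly what its neighbour gains, the total number of cars changes only by floating-point rounding.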
TL;DR: This paper reviews previous work that applies interactive approaches to data compression and discusses the possibility of substituting entropy with conditional entropy in the fundamental source coding theorem to have a new theoretical limit that allows for better compression.
Abstract: If we can use previous knowledge of the source (or the knowledge of a source that is correlated to the one we want to compress) to exploit the compression process then we can have significant gains in compression. By doing this in the fundamental source coding theorem we can substitute entropy with conditional entropy and we have a new theoretical limit that allows for better compression. To do this, when data compression is used for data transmission, we can assume some degree of interaction between the compressor and the decompressor that can allow a more efficient usage of the previous knowledge they both have of the source. In this paper we review previous work that applies interactive approaches to data compression and discuss this possibility.
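The gain from substituting conditional entropy for entropy can be checked on a toy example. The sketch below assumes a hypothetical joint distribution of a binary source X and correlated side information Y; the numbers and function names are illustrative only.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(X|Y) in bits, where joint[y][x] = P(X = x, Y = y)."""
    h = 0.0
    for row in joint:
        p_y = sum(row)
        if p_y > 0:
            h += p_y * entropy([p / p_y for p in row])
    return h


# Hypothetical joint distribution: X agrees with Y 90% of the time.
joint = [[0.45, 0.05],
         [0.05, 0.45]]

p_x = [sum(col) for col in zip(*joint)]    # marginal of X
h_x = entropy(p_x)                         # about 1 bit per symbol
h_x_given_y = conditional_entropy(joint)   # about 0.47 bits per symbol
```

In this example a compressor and decompressor that both know Y could in principle spend fewer than half the bits per symbol that a compressor ignoring Y would need.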
TL;DR: The algorithm is based on the joint solution of a system of two partial differential equations and provides strong solutions for finite-dimensional systems of SDEs driven by standard Wiener processes and with adapted initial data.
Abstract: This brief note presents an algorithm to solve ordinary stochastic differential equations (SDEs). The algorithm is based on the joint solution of a system of two partial differential equations and provides strong solutions for finite-dimensional systems of SDEs driven by standard Wiener processes and with adapted initial data. Several examples illustrate its use.
TL;DR: The results show that highly biodegradable oils can be better predicted through numeric models than classification models, and a simple classification rule derived from the most significant predictor uncovered by Decision Trees resulted in good classification accuracy.
Abstract: In this paper, we apply various data mining techniques, including continuous numeric and discrete classification prediction models, to base oil biodegradability, with emphasis on improving prediction accuracy. The results show that highly biodegradable oils can be better predicted through numeric models. In contrast, classification models did not uncover a similar dichotomy. With the exception of Memory Based Reasoning and Decision Trees, tested classification techniques achieved high classification accuracy. However, the technique of Decision Trees helped uncover the most significant predictors. A simple classification rule derived from the most significant predictor resulted in good classification accuracy. The application of this rule enables efficient classification of base oils into either low or high biodegradability classes with high accuracy. For the latter, a higher precision biodegradability prediction can be obtained using continuous modeling techniques.
TL;DR: This work introduces an algorithm that applies tools of computational geometry to the computation of the metric average of 2D sets with piecewise linear boundaries.
Abstract: The metric average is a binary operation between sets in Rn which is used in the approximation of set-valued functions. We introduce an algorithm that applies tools of computational geometry to the computation of the metric average of 2D sets with piecewise linear boundaries.
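As a rough illustration of the operation itself, the sketch below computes a metric average for finite point sets, a much simpler setting than the 2D sets with piecewise linear boundaries treated in the paper. It assumes the common definition in which each point is paired with its nearest points in the other set and the pairs are averaged with weights t and 1 - t; the weight convention and function names are assumptions.

```python
import math


def nearest(p, points):
    """All points of `points` at minimal Euclidean distance from p."""
    d = min(math.dist(p, q) for q in points)
    return [q for q in points if math.isclose(math.dist(p, q), d)]


def metric_average(A, B, t):
    """Finite-set metric average: t * a + (1 - t) * b over metric pairs (a, b)."""
    pairs = {(a, b) for a in A for b in nearest(a, B)}
    pairs |= {(a, b) for b in B for a in nearest(b, A)}
    return sorted(
        {tuple(t * x + (1 - t) * y for x, y in zip(a, b)) for a, b in pairs}
    )


A = [(0.0, 0.0), (4.0, 0.0)]
B = [(0.0, 2.0)]
mid = metric_average(A, B, 0.5)  # [(0.0, 1.0), (2.0, 1.0)]
```

Unlike the Minkowski average, which combines every point of A with every point of B, only nearest-point pairs contribute, so with t = 1 the result is A itself.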
TL;DR: Solomonoff was one of the first researchers to treat probabilistic grammars and the associated languages; he treated probabilistic Artificial Intelligence (AI) when "probabilistic" was unfashionable, and treated questions of machine learning early on.
Abstract: Ray J. Solomonoff died on December 7, 2009, in Cambridge, Massachusetts, of complications of a stroke caused by an aneurism in his head. Ray was the first inventor of Algorithmic Information Theory which deals with the shortest effective description length of objects and is commonly designated by the term “Kolmogorov complexity.” In the 1950s Solomonoff was one of the first researchers to treat probabilistic grammars and the associated languages. He treated probabilistic Artificial Intelligence (AI) when “probabilistic” was unfashionable, and treated questions of machine learning early on. But his greatest contribution is the creation of Algorithmic Information Theory. [...]
TL;DR: The properties of the vertex that is numbered 1 by MLS on a chordal graph and by MLSM on an arbitrary graph are investigated to show the remarkable property that the minimal separators included in the neighborhood of this vertex are totally ordered by inclusion.
Abstract: Graph search algorithms have exploited graph extremities, such as the leaves of a tree and the simplicial vertices of a chordal graph. Recently, several well-known graph search algorithms have been collectively expressed as two generic algorithms called MLS and MLSM. In this paper, we investigate the properties of the vertex that is numbered 1 by MLS on a chordal graph and by MLSM on an arbitrary graph. We explain how this vertex is an extremity of the graph. Moreover, we show the remarkable property that the minimal separators included in the neighborhood of this vertex are totally ordered by inclusion.
TL;DR: This work introduces an algorithm for the direct suffix sorting problem with worst case time complexity in O(n), requiring only (1⅔ n log n − n log |Σ| + O(1)) bits in memory space, and the basis of this algorithm is an extension of Shannon-Fano-Elias codes used in source coding and information theory.
Abstract: Given a sequence T = t_0 t_1 ... t_{n-1} of size n = |T|, with symbols from a fixed alphabet Σ (|Σ| ≤ n), the suffix array provides a listing of all the suffixes of T in lexicographic order. Given T, the suffix sorting problem is to construct its suffix array. The direct suffix sorting problem is to construct the suffix array of T directly without using the suffix tree data structure. While algorithms for linear time, linear space direct suffix sorting have been proposed, the actual constant in the linear space is still a major concern, given that the applications of suffix trees and suffix arrays (such as in whole-genome analysis) often involve huge data sets. In this work, we reduce the gap between current results and the minimal space requirement. We introduce an algorithm for the direct suffix sorting problem with worst case time complexity in O(n), requiring only (1⅔ n log n − n log |Σ| + O(1)) bits in memory space. This implies 5⅔ n + O(1) bytes for the total space requirement (including space for both the output suffix array and the input sequence T), assuming n ≤ 2^32, |Σ| ≤ 256, and 4 bytes per integer. The basis of our algorithm is an extension of Shannon-Fano-Elias codes used in source coding and information theory. This is the first time information-theoretic methods have been used as the basis for solving the suffix sorting problem.
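To make the definitions concrete, the sketch below builds a suffix array by plain comparison sorting and redoes the byte-count arithmetic. The naive construction is for illustration only and is not the linear-time Shannon-Fano-Elias method of the paper; the arithmetic takes the space bound as (5/3) n log n − n log |Σ| bits with log n = 32 and log |Σ| = 8, and the function names are hypothetical.

```python
from fractions import Fraction


def suffix_array_naive(text):
    """Start positions of all suffixes of `text`, in lexicographic order.

    Comparison sorting of explicit suffixes: simple, but far from O(n).
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bytes_per_symbol(log_n_bits, log_sigma_bits):
    """Per-symbol space from (5/3) * log n - log |Sigma| bits, in bytes."""
    bits = Fraction(5, 3) * log_n_bits - log_sigma_bits
    return bits / 8


sa = suffix_array_naive("banana")   # [5, 3, 1, 0, 4, 2]
space = bytes_per_symbol(32, 8)     # Fraction(17, 3), i.e. 5 2/3 bytes
```

With 4-byte integers and a 1-byte alphabet, the per-symbol figure works out to 17/3 bytes, matching the total quoted in the abstract up to the O(1) term.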
TL;DR: The experimental results on the DDSM database show the promise of the GCD algorithm in breast cancer detection, which achieved TP (true positive rate) = 90% at FPI (false positives per image) = 1.21 in mass detection, and TP = 93% at FPI = 1.19 in calcification detection.
Abstract: A new breast cancer detection algorithm, named the “Gabor Cancer Detection” (GCD) algorithm, utilizing Gabor features is proposed. Three major steps are involved in the GCD algorithm, preprocessing, segmentation (generating alarm segments), and classification (reducing false alarms). In preprocessing, a digital mammogram is down-sampled, quantized, denoised and enhanced. Nonlinear diffusion is used for noise suppression. In segmentation, a band-pass filter is formed by rotating a 1-D Gaussian filter (off center) in frequency space, termed as “Circular Gaussian Filter” (CGF). A CGF can be uniquely characterized by specifying a central frequency and a frequency band. A mass or calcification is a space-occupying lesion and usually appears as a bright region on a mammogram. The alarm segments (suspicious to be masses/calcifications) can be extracted out using a threshold that is adaptively decided upon the histogram analysis of the CGF-filtered mammogram. In classification, a Gabor filter bank is formed with five bands by four orientations (horizontal, vertical, 45 and 135 degree) in Fourier frequency domain. For each mammographic image, twenty Gabor-filtered images are produced. A set of edge histogram descriptors (EHD) are then extracted from 20 Gabor images for classification. An EHD signature is computed with four orientations of Gabor images along each band and five EHD signatures are then joined together to form an EHD feature vector of 20 dimensions. With the EHD features, the fuzzy C-means clustering technique and k-nearest neighbor (KNN) classifier are used to reduce the number of false alarms. The experimental results tested on the DDSM database (University of South Florida) show the promises of GCD algorithm in breast cancer detection, which achieved TP (true positive rate) = 90% at FPI (false positives per image) = 1.21 in mass detection; and TP = 93% at FPI = 1.19 in calcification detection.
TL;DR: This paper is a review which presents and explains the decomposition of graphs by clique minimal separators, and provides easy algorithms to implement this decomposition.
Abstract: This paper is a review which presents and explains the decomposition of graphs by clique minimal separators. The pace is leisurely, we give many examples and figures. Easy algorithms are provided to implement this decomposition. The historical and theoretical background is given, as well as sketches of proofs of the structural results involved.