Top 132 IEEE Transactions on Pattern Analysis and Machine Intelligence papers published in 2000

Showing papers in "IEEE Transactions on Pattern Analysis and Machine Intelligence in 2000"

A flexible new technique for camera calibration

Zhengyou Zhang¹•Institutions (1)

01 Nov 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A flexible technique to easily calibrate a camera that only requires the camera to observe a planar pattern shown at a few (at least two) different orientations is proposed and advances 3D computer vision one more step from laboratory environments to real world use.

Abstract: We propose a flexible technique to easily calibrate a camera. It only requires the camera to observe a planar pattern shown at a few (at least two) different orientations. Either the camera or the planar pattern can be freely moved. The motion need not be known. Radial lens distortion is modeled. The proposed procedure consists of a closed-form solution, followed by a nonlinear refinement based on the maximum likelihood criterion. Both computer simulation and real data have been used to test the proposed technique and very good results have been obtained. Compared with classical techniques which use expensive equipment such as two or three orthogonal planes, the proposed technique is easy to use and flexible. It advances 3D computer vision one more step from laboratory environments to real world use.

15,641 citations

Journal Article•10.1109/34.868688•

Normalized cuts and image segmentation

Jianbo Shi¹, Jitendra Malik²•Institutions (2)

Carnegie Mellon University¹, University of California, Berkeley²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups as well as the total similarity within the groups.

Abstract: We propose a novel approach for solving the perceptual grouping problem in vision. Rather than focusing on local features and their consistencies in the image data, our approach aims at extracting the global impression of an image. We treat image segmentation as a graph partitioning problem and propose a novel global criterion, the normalized cut, for segmenting the graph. The normalized cut criterion measures both the total dissimilarity between the different groups as well as the total similarity within the groups. We show that an efficient computational technique based on a generalized eigenvalue problem can be used to optimize this criterion. We applied this approach to segmenting static images, as well as motion sequences, and found the results to be very encouraging.
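
As a rough illustration of the criterion described in this abstract, the sketch below evaluates the normalized cut of a two-way partition of a small weighted graph, using the definition Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V). The function name `ncut_value` and the toy weight matrix are assumptions made for this example, and the eigenvalue-based optimization from the paper is not shown.

```python
def ncut_value(weights, group_a):
    """Normalized cut of the partition (A, V minus A).

    weights: symmetric matrix (list of lists) of non-negative edge weights.
    group_a: set of node indices forming group A (must not be empty or all nodes).
    """
    n = len(weights)
    in_a = [i in group_a for i in range(n)]
    # cut(A, B): total weight of the edges that cross the partition.
    cut = sum(weights[i][j] for i in range(n) for j in range(n)
              if in_a[i] and not in_a[j])
    # assoc(X, V): total weight from the nodes of X to every node in the graph.
    assoc_a = sum(weights[i][j] for i in range(n) for j in range(n) if in_a[i])
    assoc_b = sum(weights[i][j] for i in range(n) for j in range(n) if not in_a[i])
    return cut / assoc_a + cut / assoc_b


# Toy graph (hypothetical): two tightly linked pairs {0, 1} and {2, 3}
# joined by one weak edge between nodes 1 and 2.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

balanced = ncut_value(W, {0, 1})   # cuts only the weak edge
lopsided = ncut_value(W, {0})      # isolates a single node
print(balanced < lopsided)
```

Because the cut weight is divided by each group's total association, isolating a single node scores worse here than splitting along the weak edge, which is the behavior the abstract attributes to the criterion.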

15,625 citations

Journal Article•10.1109/34.824819•

Statistical pattern recognition: a review

Anil K. Jain¹, Robert P. W. Duin², Jianchang Mao³•Institutions (3)

Michigan State University¹, Delft University of Technology², IBM³

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

Abstract: The primary goal of pattern recognition is supervised or unsupervised classification. Among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice. More recently, neural network techniques and methods imported from statistical learning theory have been receiving increasing attention. The design of a recognition system requires careful attention to the following issues: definition of pattern classes, sensing environment, pattern representation, feature extraction and selection, cluster analysis, classifier design and learning, selection of training and test samples, and performance evaluation. In spite of almost 50 years of research and development in this field, the general problem of recognizing complex patterns with arbitrary orientation, location, and scale remains unsolved. New and emerging applications, such as data mining, web searching, retrieval of multimedia data, face recognition, and cursive handwriting recognition, require robust and efficient pattern recognition techniques. The objective of this review paper is to summarize and compare some of the well-known methods used in various stages of a pattern recognition system and identify research topics and applications which are at the forefront of this exciting and challenging field.

7,525 citations

Journal Article•10.1109/34.895972•

Content-based image retrieval at the end of the early years

Arnold W. M. Smeulders¹, Marcel Worring¹, Simone Santini², Amarnath Gupta², Ramesh Jain•Institutions (2)

University of Amsterdam¹, University of California, San Diego²

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap are discussed, as well as aspects of system engineering: databases, system architecture, and evaluation.

Abstract: Presents a review of 200 references in content-based image retrieval. The paper starts with discussing the working conditions of content-based retrieval: patterns of use, types of pictures, the role of semantics, and the sensory gap. Subsequent sections discuss computational steps for image retrieval systems. Step one of the review is image processing for retrieval sorted by color, texture, and local geometry. Features for retrieval are discussed next, sorted by: accumulative and global features, salient points, object and shape features, signs, and structural combinations thereof. Similarity of pictures and objects in pictures is reviewed for each of the feature types, in close connection to the types and means of feedback the user of the systems is capable of giving by interaction. We briefly discuss aspects of system engineering: databases, system architecture, and evaluation. In the concluding section, we present our view on: the driving force of the field, the heritage from computer vision, the influence on computer vision, the role of similarity and of interaction, the need for databases, the problem of evaluation, and the role of the semantic gap.

7,039 citations

Journal Article•10.1109/34.879790•

The FERET evaluation methodology for face-recognition algorithms

P.J. Phillips, Hyeonjoon Moon, Syed A. Rizvi¹, Patrick J. Rauss²•Institutions (2)

College of Staten Island¹, United States Army Research Laboratory²

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems.

Abstract: Two of the most critical requirements in support of producing reliable face-recognition systems are a large database of facial images and a testing procedure to evaluate systems. The Face Recognition Technology (FERET) program has addressed both issues through the FERET database of facial images and the establishment of the FERET tests. To date, 14,126 images from 1,199 individuals are included in the FERET database, which is divided into development and sequestered portions of the database. In September 1996, the FERET program administered the third in a series of FERET face-recognition tests. The primary objectives of the third test were to 1) assess the state of the art, 2) identify future areas of research, and 3) measure algorithm performance.

5,101 citations

Journal Article•10.1109/34.824822•

Medical image analysis: progress over two decades and the challenges ahead

James S. Duncan¹, Nicholas Ayache²•Institutions (2)

Yale University¹, French Institute for Research in Computer Science and Automation²

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper looks at progress in the field over the last 20 years and suggests some of the challenges that remain for the years to come.

Abstract: The analysis of medical images has been woven into the fabric of the pattern analysis and machine intelligence (PAMI) community since the earliest days of these Transactions. Initially, the efforts in this area were seen as applying pattern analysis and computer vision techniques to another interesting dataset. However, over the last two to three decades, the unique nature of the problems presented within this area of study has led to the development of a new discipline in its own right. Examples of these include: the types of image information that are acquired, the fully three-dimensional image data, the nonrigid nature of object motion and deformation, and the statistical variation of both the underlying normal and abnormal ground truth. In this paper, we look at progress in the field over the last 20 years and suggest some of the challenges that remain for the years to come.

4,373 citations

Journal Article•10.1109/34.824819•

Statistical Pattern Recognition

Anil K. Jain, Robert P. W. Duin, Jianchang Mao

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The primary goal of pattern recognition is supervised or unsupervised classification; among the various frameworks in which pattern recognition has been traditionally formulated, the statistical approach has been most intensively studied and used in practice.

4,309 citations

Journal Article•10.1109/34.868677•

Learning patterns of activity using real-time tracking

Chris Stauffer¹, W.E.L. Grimson¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper focuses on motion tracking and shows how one can use observed motion to learn patterns of activity in a site and create a hierarchical binary-tree classification of the representations within a sequence.

Abstract: Our goal is to develop a visual monitoring system that passively observes moving objects in a site and learns patterns of activity from those observations. For extended sites, the system will require multiple cameras. Thus, key elements of the system are motion tracking, camera coordination, activity classification, and event detection. In this paper, we focus on motion tracking and show how one can use observed motion to learn patterns of activity in a site. Motion segmentation is based on an adaptive background subtraction method that models each pixel as a mixture of Gaussians and uses an online approximation to update the model. The Gaussian distributions are then evaluated to determine which are most likely to result from a background process. This yields a stable, real-time outdoor tracker that reliably deals with lighting changes, repetitive motions from clutter, and long-term scene changes. While a tracking system is unaware of the identity of any object it tracks, the identity remains the same for the entire tracking sequence. Our system leverages this information by accumulating joint co-occurrences of the representations within a sequence. These joint co-occurrence statistics are then used to create a hierarchical binary-tree classification of the representations. This method is useful for classifying sequences, as well as individual instances of activities in a site.
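
The adaptive background model mentioned in this abstract can be sketched for a single grayscale pixel as follows. This is a simplified illustration, not the authors' implementation: the class name `PixelMixture`, the parameter values (`k`, `alpha`, `init_var`, `min_var`, `bg_threshold`), the variance floor, and the use of a constant second learning rate `rho = alpha` are all assumptions made for this example.

```python
import math


class PixelMixture:
    """Adaptive mixture of 1D Gaussians for one grayscale pixel (illustrative)."""

    def __init__(self, k=3, alpha=0.05, init_var=36.0, min_var=4.0, bg_threshold=0.7):
        self.k = k                        # maximum number of Gaussian modes
        self.alpha = alpha                # learning rate for the weights
        self.init_var = init_var          # variance given to a newly created mode
        self.min_var = min_var            # assumed floor to keep variances positive
        self.bg_threshold = bg_threshold  # weight mass treated as background
        self.modes = []                   # each mode is [weight, mean, variance]

    @staticmethod
    def _rank(mode):
        # Modes with high weight and low spread are more likely background.
        return mode[0] / math.sqrt(mode[2])

    def update(self, x):
        """Fold intensity x into the model; return True if x looks like background."""
        # Find the first mode (in rank order) within 2.5 standard deviations.
        matched = None
        for mode in self.modes:
            if abs(x - mode[1]) <= 2.5 * math.sqrt(mode[2]):
                matched = mode
                break

        # Decay every weight, then reinforce the matched mode.
        for mode in self.modes:
            mode[0] *= 1.0 - self.alpha
        if matched is not None:
            matched[0] += self.alpha
            rho = self.alpha  # assumption: constant rate instead of a likelihood-based one
            matched[1] += rho * (x - matched[1])
            matched[2] += rho * ((x - matched[1]) ** 2 - matched[2])
            matched[2] = max(matched[2], self.min_var)
        else:
            # No match: replace the least probable mode with a new, wide, low-weight one.
            if len(self.modes) >= self.k:
                self.modes.sort(key=self._rank, reverse=True)
                self.modes.pop()
            self.modes.append([self.alpha, x, self.init_var])

        # Renormalize the weights and keep the modes ordered by rank.
        total = sum(mode[0] for mode in self.modes)
        for mode in self.modes:
            mode[0] /= total
        self.modes.sort(key=self._rank, reverse=True)

        # The top-ranked modes whose cumulative weight first exceeds the
        # threshold are treated as background.
        covered = 0.0
        for mode in self.modes:
            if mode is matched:
                return True
            covered += mode[0]
            if covered > self.bg_threshold:
                break
        return False


model = PixelMixture()
for _ in range(200):
    model.update(100.0)          # a stable background intensity
print(model.update(100.0))       # same intensity again
print(model.update(200.0))       # sudden bright value
```

In a full tracker one such model would be kept per pixel (and per color channel), and the pixels flagged as non-background would then be grouped into connected regions for tracking.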

3,927 citations

Journal Article•10.1109/34.868683•

W⁴: real-time surveillance of people and their activities

Ismail Haritaoglu¹, D. Harwood², Larry S. Davis²•Institutions (2)

IBM¹, University of Maryland, College Park²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: W⁴ employs a combination of shape analysis and tracking to locate people and their parts and to create models of people's appearance so that they can be tracked through interactions such as occlusions.

Abstract: W⁴ is a real-time visual surveillance system for detecting and tracking multiple people and monitoring their activities in an outdoor environment. It operates on monocular gray-scale video imagery, or on video imagery from an infrared camera. W⁴ employs a combination of shape analysis and tracking to locate people and their parts (head, hands, feet, torso) and to create models of people's appearance so that they can be tracked through interactions such as occlusions. It can determine whether a foreground region contains multiple people and can segment the region into its constituent people and track them. W⁴ can also determine whether people are carrying objects, and can segment objects from their silhouettes, and construct appearance models for them so they can be identified in subsequent frames. W⁴ can recognize events between people and objects, such as depositing an object, exchanging bags, or removing an object. It runs at 25 Hz for 320×240 resolution images on a 400 MHz dual-Pentium II PC.

3,060 citations

Journal Article•10.1109/34.824821•

Online and off-line handwriting recognition: a comprehensive survey

Réjean Plamondon¹, Sargur N. Srihari²•Institutions (2)

École Normale Supérieure¹, University at Buffalo²

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms are described.

Abstract: Handwriting has continued to persist as a means of communication and recording information in day-to-day life even with the introduction of new technologies. Given its ubiquity in human transactions, machine recognition of handwriting has practical significance, as in reading handwritten notes in a PDA, in postal addresses on envelopes, in amounts in bank checks, in handwritten fields in forms, etc. This overview describes the nature of handwritten language, how it is transduced into electronic data, and the basic concepts behind written language recognition algorithms. Both the online case (which pertains to the availability of trajectory data during writing) and the off-line case (which pertains to scanned images) are considered. Algorithms for preprocessing, character and word recognition, and performance with practical systems are indicated. Other fields of application, such as signature verification, writer authentication, and handwriting learning tools, are also considered.

2,877 citations

Journal Article•10.1109/TPAMI.2000.895971•

A 20th anniversary survey: introduction to "content-based image retrieval at the end of the early years"

K. Bowyer, P. Flynn

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

Journal Article•10.1109/34.895976•

Automatic analysis of facial expressions: the state of the art

Maja Pantic¹, Léon J. M. Rothkrantz¹•Institutions (1)

Delft University of Technology¹

01 Dec 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The capability of the human visual system with respect to these problems is discussed, and it is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

Abstract: Humans detect and interpret faces and facial expressions in a scene with little or no effort. Still, development of an automated system that accomplishes this task is rather difficult. There are several related problems: detection of an image segment as a face, extraction of the facial expression information, and classification of the expression (e.g., in emotion categories). A system that performs these operations accurately and in real time would form a big step in achieving a human-like interaction between man and machine. The paper surveys the past work in solving these problems. The capability of the human visual system with respect to these problems is discussed, too. It is meant to serve as an ultimate goal and a guide for determining recommendations for development of an automatic facial expression analyzer.

Journal Article•10.1109/34.868684•

A Bayesian computer vision system for modeling human interactions

Nuria Oliver¹, Barbara Rosario², Alex Pentland³•Institutions (3)

Microsoft¹, University of California, Berkeley², Massachusetts Institute of Technology³

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task and demonstrates the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.

Abstract: We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in a visual surveillance task. The system deals in particular with detecting when interactions between people occur and classifying the type of interaction. Examples of interesting interaction behaviors include following another person, altering one's path to meet another, and so forth. Our system combines top-down with bottom-up information in a closed feedback loop, with both components employing a statistical Bayesian approach. We propose and compare two different state-based learning architectures, namely, HMMs and CHMMs for modeling behaviors and interactions. Finally, a synthetic "Alife-style" training system is used to develop flexible prior models for recognizing human interactions. We demonstrate the ability to use these a priori models to accurately classify real human behaviors and interactions with no additional tuning or training.

Journal Article•10.1109/34.865189•

Assessing a mixture model for clustering with the integrated completed likelihood

Christophe Biernacki, Gilles Celeux¹, Gérard Govaert²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of Technology of Compiègne²

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A method for assessing mixture models in a cluster analysis setting using the integrated completed likelihood (ICL), which appears to be more robust than BIC to violation of some of the mixture model assumptions and can select a number of clusters leading to a sensible partitioning of the data.

Abstract: We propose a method for assessing a mixture model in a cluster analysis setting with the integrated completed likelihood. For this purpose, the observed data are assigned to unknown clusters using a maximum a posteriori operator. Then, the integrated completed likelihood (ICL) is approximated using the Bayesian information criterion (BIC). Numerical experiments on simulated and real data of the resulting ICL criterion show that it performs well both for choosing a mixture model and a relevant number of clusters. In particular, ICL appears to be more robust than BIC to violation of some of the mixture model assumptions and it can select a number of clusters leading to a sensible partitioning of the data.
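
The relationship between BIC and the BIC-style approximation of ICL described in this abstract can be sketched as below. This is a minimal illustration under stated assumptions: it uses the "larger is better" sign convention, the function names `bic` and `icl` are hypothetical, and the inputs (maximized log-likelihood, number of free parameters, posterior membership probabilities) are assumed to come from a mixture model fitted elsewhere.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """BIC in the 'larger is better' form: log L minus a complexity penalty."""
    return log_likelihood - 0.5 * n_params * math.log(n_samples)


def icl(log_likelihood, n_params, posteriors):
    """BIC-style approximation of the integrated completed likelihood.

    posteriors: one row per observation, holding the fitted probability of
    membership in each cluster. Each observation is assigned to its most
    probable cluster (maximum a posteriori), so the completed log-likelihood
    equals the observed log-likelihood plus the sum of log(max posterior).
    """
    n_samples = len(posteriors)
    map_term = sum(math.log(max(row)) for row in posteriors)
    return bic(log_likelihood, n_params, n_samples) + map_term


# Hypothetical fitted quantities for two observations and two clusters.
well_separated = [[1.0, 0.0], [0.0, 1.0]]
overlapping = [[0.6, 0.4], [0.5, 0.5]]

print(icl(-10.0, 5, well_separated) == bic(-10.0, 5, 2))
print(icl(-10.0, 5, overlapping) < bic(-10.0, 5, 2))
```

The extra term is zero when every observation is assigned with certainty and negative otherwise, so in this form ICL penalizes models whose clusters overlap, in addition to the complexity penalty it shares with BIC.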

Journal Article•10.1109/34.879788•

Geometric camera calibration using circular control points

Janne Heikkilä¹•Institutions (1)

University of Oulu¹

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A calibration procedure for precise 3D computer vision applications is described that introduces bias correction for circular control points and a nonrecursive method for reversing the distortion model and indicates improvements in the calibration results in limited error conditions.

Abstract: Modern CCD cameras are usually capable of a spatial accuracy greater than 1/50 of the pixel size. However, such accuracy is not easily attained due to various error sources that can affect the image formation process. Current calibration methods typically assume that the observations are unbiased, the only error is the zero-mean independent and identically distributed random noise in the observed image coordinates, and the camera model completely explains the mapping between the 3D coordinates and the image coordinates. In general, these conditions are not met, causing the calibration results to be less accurate than expected. In the paper, a calibration procedure for precise 3D computer vision applications is described. It introduces bias correction for circular control points and a nonrecursive method for reversing the distortion model. The accuracy analysis is presented and the error sources that can reduce the theoretical accuracy are discussed. The tests with synthetic images indicate improvements in the calibration results in limited error conditions. In real images, the suppression of external error sources becomes a prerequisite for successful calibration.

Journal Article•10.1109/34.862199•

Fast and globally convergent pose estimation from video images

C. P. Lu, Gregory D. Hager¹, Eric Mjolsness²•Institutions (2)

Johns Hopkins University¹, Jet Propulsion Laboratory²

01 Jun 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: It is shown that the pose estimation problem can be formulated as that of minimizing an error metric based on collinearity in object (as opposed to image) space, and an iterative algorithm which directly computes orthogonal rotation matrices and which is globally convergent is derived.

Abstract: Determining the rigid transformation relating 2D images to known 3D geometry is a classical problem in photogrammetry and computer vision. Heretofore, the best methods for solving the problem have relied on iterative optimization methods which cannot be proven to converge and/or which do not effectively account for the orthonormal structure of rotation matrices. We show that the pose estimation problem can be formulated as that of minimizing an error metric based on collinearity in object (as opposed to image) space. Using object space collinearity error, we derive an iterative algorithm which directly computes orthogonal rotation matrices and which is globally convergent. Experimentally, we show that the method is computationally efficient, that it is no less accurate than the best currently employed optimization methods, and that it outperforms all tested methods in robustness to outliers.

Journal Article•10.1109/34.868681•

Robust real-time periodic motion detection, analysis, and applications

Ross Cutler¹, Larry S. Davis¹•Institutions (1)

University of Maryland, College Park¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: New techniques to detect and analyze periodic motion as seen from both a static and a moving camera are described and the periodicity is analyzed robustly using the 2D lattice structures inherent in similarity matrices.

Abstract: We describe new techniques to detect and analyze periodic motion as seen from both a static and a moving camera. By tracking objects of interest, we compute an object's self-similarity as it evolves in time. For periodic motion, the self-similarity measure is also periodic and we apply time-frequency analysis to detect and characterize the periodic motion. The periodicity is also analyzed robustly using the 2D lattice structures inherent in similarity matrices. A real-time system has been implemented to track and classify objects using periodicity. Examples of object classification (people, running dogs, vehicles), person counting, and nonstationary periodicity are provided.

Journal Article•10.1109/34.868686•

Recognition of visual activities and interactions by stochastic parsing

Yuri A. Ivanov¹, Aaron F. Bobick²•Institutions (2)

Massachusetts Institute of Technology¹, Georgia Institute of Technology²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents and how the system correctly interprets activities of multiple interacting objects is demonstrated.

Abstract: This paper describes a probabilistic syntactic approach to the detection and recognition of temporally extended activities and interactions between multiple agents. The fundamental idea is to divide the recognition problem into two levels. The lower level detections are performed using standard independent probabilistic event detectors to propose candidate detections of low-level features. The outputs of these detectors provide the input stream for a stochastic context-free grammar parsing mechanism. The grammar and parser provide longer range temporal constraints, disambiguate uncertain low-level detections, and allow the inclusion of a priori knowledge about the structure of temporal events in a given domain. We develop a real-time system and demonstrate the approach in several experiments on gesture recognition and in video surveillance. In the surveillance application, we show how the system correctly interprets activities of multiple interacting objects.

Journal Article•10.1109/34.877520•

Algorithms for defining visual regions-of-interest: comparison with eye fixations

C.M. Privitera¹, L.W. Stark¹•Institutions (1)

University of California, Berkeley¹

01 Sep 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper investigates and develops a methodology that serves to automatically identify a subset of aROIs (algorithmically detected ROIs) using different image processing algorithms (IPAs) and appropriate clustering procedures, and compares aROIs with hROIs (human-identified ROIs) as a criterion for evaluating and selecting bottom-up, context-free algorithms.

Abstract: Many machine vision applications, such as compression, pictorial database querying, and image understanding, often need to analyze in detail only a representative subset of the image, which may be arranged into sequences of loci called regions-of-interest (ROIs). We have investigated and developed a methodology that serves to automatically identify such a subset of aROIs (algorithmically detected ROIs) using different image processing algorithms (IPAs), and appropriate clustering procedures. In human perception, an internal representation directs top-down, context-dependent sequences of eye movements to fixate on similar sequences of hROIs (human identified ROIs). In the paper, we introduce our methodology and we compare aROIs with hROIs as a criterion for evaluating and selecting bottom-up, context-free algorithms. An application is finally discussed.

Journal Article•10.1109/34.865184•

A cooperative algorithm for stereo matching and occlusion detection

C.L. Zitnick¹, Takeo Kanade¹•Institutions (1)

Carnegie Mellon University¹

01 Jul 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Presents a stereo algorithm for obtaining disparity maps with occlusion explicitly detected, and presents the processing results from synthetic and real image pairs, including ones with ground-truth values for quantitative comparison with other methods.

Abstract: Presents a stereo algorithm for obtaining disparity maps with occlusion explicitly detected. To produce smooth and detailed disparity maps, two assumptions that were originally proposed by Marr and Poggio (1976, 1979) are adopted: uniqueness and continuity. That is, the disparity maps have a unique value per pixel and are continuous almost everywhere. These assumptions are enforced within a three-dimensional array of match values in disparity space. Each match value corresponds to a pixel in an image and a disparity relative to another image. An iterative algorithm updates the match values by diffusing support among neighboring values and inhibiting others along similar lines of sight. By applying the uniqueness assumption, occluded regions can be explicitly identified. To demonstrate the effectiveness of the algorithm, we present the processing results from synthetic and real image pairs, including ones with ground-truth values for quantitative comparison with other methods.

Journal Article•10.1109/34.824820•

Twenty years of document image analysis in PAMI

George Nagy¹•Institutions (1)

Rensselaer Polytechnic Institute¹

01 Jan 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The contributions to document image analysis of 99 papers published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) are clustered, summarized, interpolated, interpreted, and evaluated.

Abstract: The contributions to document image analysis of 99 papers published in the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) are clustered, summarized, interpolated, interpreted, and evaluated.

Journal Article•10.1109/34.868678•

Monitoring activities from multiple video streams: establishing a common coordinate frame

L. Lee¹, R. Romano¹, Gideon Stein¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Tracked moving objects are robustly matched and fitted to a planar model to align the scene's ground plane across multiple views, and the planar alignment matrix is decomposed to recover the 3D relative camera and ground plane positions.

Abstract: Monitoring of large sites requires coordination between multiple cameras, which in turn requires methods for relating events between distributed cameras. This paper tackles the problem of automatic external calibration of multiple cameras in an extended scene, that is, full recovery of their 3D relative positions and orientations. Because the cameras are placed far apart, brightness or proximity constraints cannot be used to match static features, so we instead apply planar geometric constraints to moving objects tracked throughout the scene. By robustly matching and fitting tracked objects to a planar model, we align the scene's ground plane across multiple views and decompose the planar alignment matrix to recover the 3D relative camera and ground plane positions. We demonstrate this technique in both a controlled lab setting where we test the effects of errors in the intrinsic camera parameters, and in an uncontrolled, outdoor setting. In the latter, we do not assume synchronized cameras and we show that enforcing geometric constraints enables us to align the tracking data in time. In spite of noise in the intrinsic camera parameters and in the image data, the system successfully transforms multiple views of the scene's ground plane to an overhead view and recovers the relative 3D camera and ground plane positions.

Journal Article•10.1109/34.841759•

Learning and design of principal curves

Balázs Kégl¹, Adam Krzyżak², Tamas Linder¹, Kenneth Zeger³•Institutions (3)

Queen's University¹, Concordia University², University of California, San Diego³

01 Mar 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work defines principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution, making it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction.


Abstract: Principal curves have been defined as "self-consistent" smooth curves which pass through the "middle" of a d-dimensional probability distribution or data cloud. They give a summary of the data and also serve as an efficient feature extraction tool. We take a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition makes it possible to theoretically analyze principal curve learning from training data and it also leads to a new practical construction. Our theoretical learning scheme chooses a curve from a class of polygonal lines with k segments and with a given total length to minimize the average squared distance over n training points drawn independently. Convergence properties of this learning scheme are analyzed and a practical version of this theoretical algorithm is implemented. In each iteration of the algorithm, a new vertex is added to the polygonal line and the positions of the vertices are updated so that they minimize a penalized squared distance criterion. Simulation results demonstrate that the new algorithm compares favorably with previous methods, both in terms of performance and computational complexity, and is more robust to varying data models.
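
The criterion in this abstract, the average squared distance from training points to a polygonal line, can be sketched as follows. This is a minimal 2D illustration under assumed names (`sq_dist_to_segment`, `avg_sq_dist_to_polyline`) and made-up sample data; it evaluates the unpenalized distance only and does not implement the vertex-insertion or penalized update steps of the authors' algorithm.

```python
def sq_dist_to_segment(p, a, b):
    """Squared Euclidean distance from point p to the segment from a to b (2D)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2


def avg_sq_dist_to_polyline(points, vertices):
    """Average over the points of the squared distance to the nearest segment."""
    total = 0.0
    for p in points:
        total += min(
            sq_dist_to_segment(p, vertices[i], vertices[i + 1])
            for i in range(len(vertices) - 1)
        )
    return total / len(points)


# Made-up example: a two-segment polygonal line and three sample points.
vertices = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
points = [(0.5, 0.5), (2.0, 1.0), (0.0, -1.0)]
cost = avg_sq_dist_to_polyline(points, vertices)
```

A learning scheme of the kind described would search over vertex positions, subject to a length constraint or penalty, to make this quantity small.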


Journal Article•10.1109/34.868685•

Discovery and segmentation of activities in video


M. Brand¹, V. Kettnaker²•Institutions (2)

Mitsubishi¹, Rensselaer Polytechnic Institute²

01 Aug 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: By minimizing the entropy of the joint distribution, a hidden Markov model's internal state machine can be made to organize observed activity into meaningful states, with uses in video monitoring and annotation, low bit-rate coding of scene activity, and detection of anomalous behavior.


Abstract: Hidden Markov models (HMMs) have become the workhorses of the monitoring and event recognition literature because they bring to time-series analysis the utility of density estimation and the convenience of dynamic time warping. Once trained, the internals of these models are considered opaque; there is no effort to interpret the hidden states. We show that by minimizing the entropy of the joint distribution, an HMM's internal state machine can be made to organize observed activity into meaningful states. This has uses in video monitoring and annotation, low bit-rate coding of scene activity, and detection of anomalous behavior. We demonstrate with models of office activity and outdoor traffic, showing how the framework learns principal modes of activity and patterns of activity change. We then show how this framework can be adapted to infer hidden state from extremely ambiguous images, in particular, inferring 3D body orientation and pose from sequences of low-resolution silhouettes.


Journal Article•10.1109/34.845377•

Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence


Shai Avidan¹, Amnon Shashua²•Institutions (2)

Microsoft¹, Hebrew University of Jerusalem²

01 Apr 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The problem of reconstructing the 3D coordinates of a moving point seen from a monocular moving camera is considered, i.e., to reconstruct moving objects from line-of-sight measurements only, and the solutions for points moving along a straight-line and along conic-section trajectories are investigated.


Abstract: We consider the problem of reconstructing the 3D coordinates of a moving point seen from a monocular moving camera, i.e., to reconstruct moving objects from line-of-sight measurements only. The task is feasible only when some constraints are placed on the shape of the trajectory of the moving point. We coin the family of such tasks as "trajectory triangulation." We investigate the solutions for points moving along a straight-line and along conic-section trajectories. We show that if the point is moving along a straight line, then the parameters of the line (and, hence, the 3D position of the point at each time instant) can be uniquely recovered, and by linear methods, from at least five views. For the case of conic-shaped trajectory, we show that generally nine views are sufficient for a unique reconstruction of the moving point and fewer views when the conic is of a known type (like a circle in 3D Euclidean space for which seven views are sufficient). The paradigm of trajectory triangulation, in general, pushes the envelope of processing dynamic scenes forward. Thus static scenes become a particular case of a more general task of reconstructing scenes rich with moving objects (where an object could be a single point).


Journal Article•10.1109/34.879794•

Mosaicing on adaptive manifolds


Shmuel Peleg¹, Benny Rousso, A. Rav-Acha¹, Assaf Zomet¹•Institutions (1)

Hebrew University of Jerusalem¹

01 Oct 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A new methodology to allow image mosaicing in more general cases of camera motion is presented, performed by projecting thin strips from the images onto manifolds which are adapted to the camera motion.


Abstract: Image mosaicing is commonly used to increase the visual field of view by pasting together many images or video frames. Existing mosaicing methods are based on projecting all images onto a predetermined single manifold: A plane is commonly used for a camera translating sideways, a cylinder is used for a panning camera, and a sphere is used for a camera which is both panning and tilting. While different mosaicing methods should therefore be used for different types of camera motion, more general types of camera motion, such as forward motion, are practically impossible for traditional mosaicing. A new methodology to allow image mosaicing in more general cases of camera motion is presented. Mosaicing is performed by projecting thin strips from the images onto manifolds which are adapted to the camera motion. While the limitations of existing mosaicing techniques are a result of using predetermined manifolds, the use of more general manifolds overcomes these limitations.


Journal Article•10.1109/34.888716•

Mode-finding for mixtures of Gaussian distributions


Miguel Á. Carreira-Perpiñán¹•Institutions (1)

Georgetown University¹

01 Nov 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Gradient-quadratic and fixed-point iteration algorithms and appropriate values for their control parameters are derived for finding all modes of a Gaussian mixture, a problem with applications in clustering and regression.


Abstract: Gradient-quadratic and fixed-point iteration algorithms and appropriate values for their control parameters are derived for finding all modes of a Gaussian mixture, a problem with applications in clustering and regression. The significance of the modes found is quantified locally by Hessian-based error bars and globally by the entropy as sparseness measure.
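
To give a feel for fixed-point mode-finding, the sketch below handles a one-dimensional Gaussian mixture. Setting the gradient of the mixture density to zero yields the update x ← Σ c_m(x) μ_m / Σ c_m(x) with c_m(x) = π_m N(x; μ_m, σ_m²) / σ_m², which is iterated until it stops moving. Starting one search from each component mean is an assumption made here for simplicity, and the names `fixed_point_mode` and `find_modes`, together with the tolerances, are illustrative rather than taken from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fixed_point_mode(x, weights, means, stds, tol=1e-10, max_iter=1000):
    """Iterate the stationary-point update of the mixture density from x."""
    for _ in range(max_iter):
        num = 0.0
        den = 0.0
        for w, m, s in zip(weights, means, stds):
            # Coefficient c_m(x): weighted component density over its variance.
            c = w * gaussian_pdf(x, m, s) / (s * s)
            num += c * m
            den += c
        if den == 0.0:
            return x  # density underflowed; no further update is possible
        x_new = num / den
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def find_modes(weights, means, stds, merge_tol=1e-4):
    """Run the iteration from every component mean and merge duplicates."""
    modes = []
    for m in means:
        x = fixed_point_mode(m, weights, means, stds)
        if all(abs(x - y) > merge_tol for y in modes):
            modes.append(x)
    return sorted(modes)


# Two well-separated components give two modes, close to the component means.
modes = find_modes([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])
```

When components overlap heavily, several starting points converge to the same location, which is why the results are merged within `merge_tol`.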


Journal Article•10.1109/34.857006•

Supervised learning of large perceptual organization: graph spectral partitioning and learning automata


Sudeep Sarkar¹, Padmanabhan Soundararajan²•Institutions (2)

University of Florida¹, University of South Florida²

01 May 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A novel strategy is offered for adapting perceptual grouping to objects in a domain; the results point to the significant role of photometric attributes in grouping and to the ability to form large salient groups from a set of local relations, each defined over a small number of primitives.


Abstract: Perceptual organization offers an elegant framework to group low-level features that are likely to come from a single object. We offer a novel strategy to adapt this grouping process to objects in a domain. Given a set of training images of objects in context, the associated learning process decides on the relative importance of the basic salient relationships such as proximity, parallelness, continuity, junctions, and common region toward segregating the objects from the background. The parameters of the grouping process are cast as probabilistic specifications of Bayesian networks that need to be learned. This learning is accomplished using a team of stochastic automata in an N-player cooperative game framework. The grouping process, which is based on graph partitioning, is fast and is able to form large groups from relationships defined over a small set of primitives. We statistically demonstrate the robust performance of the grouping and the learning frameworks on a variety of real images. Among the interesting conclusions is the significant role of photometric attributes in grouping and the ability to form large salient groups from a set of local relations, each defined over a small number of primitives.


Journal Article•10.1109/34.888714•

On the fitting of surfaces to data with covariances


Wojciech Chojnacki¹, Michael J. Brooks¹, A. van den Hengel¹, Darren Gawley¹•Institutions (1)

University of Adelaide¹

01 Nov 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This work considers the problem of estimating parameters of a model described by an equation of special form, and generates a Newton-like iterative scheme that has as its theoretical limit the minimizer of the cost function.


Abstract: We consider the problem of estimating parameters of a model described by an equation of special form. Specific models arise in the analysis of a wide class of computer vision problems, including conic fitting and estimation of the fundamental matrix. We assume that noisy data are accompanied by (known) covariance matrices characterizing the uncertainty of the measurements. A cost function is first obtained by considering a maximum-likelihood formulation and applying certain necessary approximations that render the problem tractable. A Newton-like iterative scheme is then generated for determining a minimizer of the cost function. Unlike alternative approaches such as Sampson's method or the renormalization technique, the new scheme has as its theoretical limit the minimizer of the cost function. Furthermore, the scheme is simply expressed, efficient, and unsurpassed as a general technique in our testing. An important feature of the method is that it can serve as a basis for conducting theoretical comparison of various estimation approaches.


Journal Article•10.1109/34.862200•

Fractional-step dimensionality reduction


Rohit M. Lotlikar¹, Ravi Kothari¹•Institutions (1)

University of Cincinnati¹

01 Jun 2000-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: The concept of fractional dimensionality is introduced and an incremental procedure, called the fractional-step LDA (F-LDA), is developed to reduce the dimensionality in fractional steps.


Abstract: Linear projections for dimensionality reduction, computed using linear discriminant analysis (LDA), are commonly based on optimization of certain separability criteria in the output space. The resulting optimization problem is linear, but these separability criteria are not directly related to the classification accuracy in the output space. Consequently, a trial and error procedure has to be invoked, experimenting with different separability criteria that differ in the weighting function used and selecting the one that performed best on the training set. Often, even the best weighting function among the trial choices results in poor classification of data in the subspace. In this short paper, we introduce the concept of fractional dimensionality and develop an incremental procedure, called the fractional-step LDA (F-LDA), to reduce the dimensionality in fractional steps. The F-LDA algorithm is more robust to the selection of weighting function, and for any given weighting function, it finds a subspace in which the classification accuracy is higher than that obtained using LDA.

