TL;DR: An unsupervised technique for visual learning is presented, which is based on density estimation in high-dimensional spaces using an eigenspace decomposition and is applied to the probabilistic visual modeling, detection, recognition, and coding of human faces and nonrigid objects.
Abstract: We present an unsupervised technique for visual learning, which is based on density estimation in high-dimensional spaces using an eigenspace decomposition. Two types of density estimates are derived for modeling the training data: a multivariate Gaussian (for unimodal distributions) and a mixture-of-Gaussians model (for multimodal distributions). Those probability densities are then used to formulate a maximum-likelihood estimation framework for visual search and target detection for automatic object recognition and coding. Our learning technique is applied to the probabilistic visual modeling, detection, recognition, and coding of human faces and nonrigid objects, such as hands.
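As a rough illustration of the unimodal case described above, the following sketch fits a Gaussian in the top-k principal subspace of the training data and scores new samples by log-density. The function names, the use of NumPy's SVD, and the decision to ignore the residual component outside the subspace are assumptions made for this example, not details taken from the abstract.

```python
import numpy as np

def fit_eigenspace_gaussian(X, k):
    """Fit a Gaussian in the top-k principal subspace of X (n_samples x n_features)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = (s ** 2) / (X.shape[0] - 1)  # variances along each direction
    return mean, Vt[:k], eigvals[:k]

def log_density(x, mean, components, eigvals):
    """Gaussian log-density of x's projection onto the principal subspace.

    Simplification (assumption): the residual outside the subspace is ignored.
    """
    y = components @ (x - mean)            # coordinates in the eigenspace
    k = len(eigvals)
    mahalanobis = np.sum(y ** 2 / eigvals)
    return -0.5 * (mahalanobis + np.sum(np.log(eigvals)) + k * np.log(2 * np.pi))
```

A maximum-likelihood detector in this spirit would evaluate `log_density` at each candidate location and keep the highest-scoring one; a mixture-of-Gaussians variant would replace the single Gaussian with a weighted sum of such components.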
TL;DR: SOHO as discussed by the authors learns to extract comprehensive yet compact image features through a visual dictionary (VD) that facilitates cross-modal understanding by taking a whole image as input, and learns vision-language representation in an end-to-end manner.
Abstract: We study joint learning of Convolutional Neural Network (CNN) and Transformer for vision-language pre-training (VLPT) which aims to learn cross-modal alignments from millions of image-text pairs. State-of-the-art approaches extract salient image regions and align regions with words step-by-step. As region-based visual features usually represent parts of an image, it is challenging for existing vision-language models to fully understand the semantics from paired natural languages. In this paper, we propose SOHO ("Seeing Out of tHe bOx"), which takes a whole image as input and learns vision-language representation in an end-to-end manner. SOHO does not require bounding box annotations, which enables inference 10 times faster than region-based approaches. In particular, SOHO learns to extract comprehensive yet compact image features through a visual dictionary (VD) that facilitates cross-modal understanding. VD is designed to represent consistent visual abstractions of similar semantics. It is updated on-the-fly and utilized in our proposed pre-training task Masked Visual Modeling (MVM). We conduct experiments on four well-established vision-language tasks by following standard VLPT settings. In particular, SOHO achieves absolute gains of 2.0% R@1 score on MSCOCO text retrieval 5k test split, 1.5% accuracy on NLVR2 test-P split, 6.7% accuracy on SNLI-VE test split, respectively.
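To make the idea of a visual dictionary that is "updated on-the-fly" more concrete, here is a minimal sketch of one way such a lookup and update could work: each feature vector is mapped to its nearest dictionary entry, and the entries that were used are nudged toward the mean of their assigned features. The nearest-neighbor assignment, the moving-average update, the `momentum` parameter, and both function names are assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def quantize(features, dictionary):
    """Map each feature (n, d) to its nearest dictionary entry (k, d)."""
    # Squared Euclidean distance between every feature and every entry.
    d2 = ((features[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return idx, dictionary[idx]

def update_dictionary(features, idx, dictionary, momentum=0.99):
    """Move each used entry toward the mean of its assigned features.

    Assumption: a simple moving-average rule; unused entries stay unchanged.
    """
    new_dictionary = dictionary.copy()
    for j in np.unique(idx):
        batch_mean = features[idx == j].mean(axis=0)
        new_dictionary[j] = momentum * dictionary[j] + (1 - momentum) * batch_mean
    return new_dictionary
```

In a masked-modeling setup along these lines, the index returned by `quantize` could serve as the prediction target for a masked image feature.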
TL;DR: In this article, the authors describe an integrated computer-based graphical interface, methods and systems providing a shell environment for development and deployment for graphic information storage and retrieval, visual modeling and dynamic simulation of complex systems.
Abstract: The present invention describes an integrated computer-based graphical interface, methods and systems providing a shell environment for development and deployment for graphic information storage and retrieval, visual modeling and dynamic simulation of complex systems. In the current implementation the system comprises libraries of knowledge-based building blocks that include sets of icons representing chemical processes, the pools of entities that participate in those processes, and the graphical description of those entities. These modular components encapsulate both information and mathematical models, in the form of tables and in the form of component icons, and a plurality of methods are associated with each of the icons. The models are built by interconnecting each pool to one or several processes, and each process to one or several pools, resulting in complex networks of multidimensional pathways. A number of functions and graphical interfaces can be selected from the menus associated with each icon, to extract in various forms the information contained in the models built with those building blocks. Those functions include the creation of interactive networks of pathways, graphic selection of complex predefined queries based on the relative position of pools of entities in the pathways, the role that the pools play in the processes, and the structural components of the entities of those pools, and quantitative simulations. The system integrates inferential control with quantitative and semi-quantitative simulation methods, and provides a variety of alternatives to deal with complex dynamic systems and with incomplete and constantly evolving information and data.
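The pool/process wiring described above amounts to a bipartite network, and a query "based on the relative position of pools" can be pictured as a graph traversal. The sketch below is only an illustration under that reading: the class name `PathwayNetwork`, its methods, and the example labels are hypothetical and are not taken from the described system.

```python
from collections import defaultdict, deque

class PathwayNetwork:
    """Bipartite network: processes consume some pools and produce others."""

    def __init__(self):
        self.inputs = defaultdict(set)   # process name -> pools it consumes
        self.outputs = defaultdict(set)  # process name -> pools it produces

    def add_process(self, name, consumes, produces):
        self.inputs[name].update(consumes)
        self.outputs[name].update(produces)

    def downstream_pools(self, pool):
        """Return every pool reachable from `pool` through one or more processes."""
        seen, queue = set(), deque([pool])
        while queue:
            current = queue.popleft()
            for process, consumed in self.inputs.items():
                if current in consumed:
                    for produced in self.outputs[process]:
                        if produced not in seen:
                            seen.add(produced)
                            queue.append(produced)
        return seen

# Usage with made-up labels: A -> (p1) -> B -> (p2) -> C, D
net = PathwayNetwork()
net.add_process("p1", consumes={"A"}, produces={"B"})
net.add_process("p2", consumes={"B"}, produces={"C", "D"})
downstream_of_a = net.downstream_pools("A")
```

An upstream query would follow the same pattern with the roles of `inputs` and `outputs` swapped.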
TL;DR: TinkerCell, as described in this paper, is a visual modeling tool that supports a hierarchy of biological parts; each part consists of a set of attributes that define it, such as sequence or rate constants.
Abstract: Synthetic biology brings together concepts and techniques from engineering and biology. In this field, computer-aided design (CAD) is necessary in order to bridge the gap between computational modeling and biological data. Using a CAD application, it would be possible to construct models using available biological "parts" and directly generate the DNA sequence that represents the model, thus increasing the efficiency of design and construction of synthetic networks. An application named TinkerCell has been developed in order to serve as a CAD tool for synthetic biology. TinkerCell is a visual modeling tool that supports a hierarchy of biological parts. Each part in this hierarchy consists of a set of attributes that define the part, such as sequence or rate constants. Models that are constructed using these parts can be analyzed using various third-party C and Python programs that are hosted by TinkerCell via an extensive C and Python application programming interface (API). TinkerCell supports the notion of modules, which are networks with interfaces. Such modules can be connected to each other, forming larger modular networks. TinkerCell is a free and open-source project under the Berkeley Software Distribution license. Downloads, documentation, and tutorials are available at http://www.tinkercell.com. An ideal CAD application for engineering biological systems would provide features such as: building and simulating networks, analyzing robustness of networks, and searching databases for components that meet the design criteria. At the current state of synthetic biology, there are no established methods for measuring robustness or identifying components that fit a design. The same is true for databases of biological parts. TinkerCell's flexible modeling framework allows it to cope with changes in the field. Such changes may involve the way parts are characterized or the way synthetic networks are modeled and analyzed computationally. TinkerCell can readily accept third-party algorithms, allowing it to serve as a platform for testing different methods relevant to synthetic biology.
TL;DR: It is shown how classical Web modeling concepts are not enough to capture the specificity of RIAs, an existing Web modeling language is extended, and an implementation of a CASE tool for visual modeling and code generation from RIA-aware specifications is provided.
Abstract: This paper addresses conceptual modeling and automatic code generation for Rich Internet Applications, a variant of Web-based systems bridging desktop and thin-client Web interfaces. We show how classical Web modeling concepts are not enough to capture the specificity of RIAs, extend an existing Web modeling language, and provide an implementation of a CASE tool for visual modeling and code generation from RIA-aware specifications. Experimentation of the proposed approach in real-world scenarios is also reported.