TL;DR: This work formulates a novel time-constrained active classification problem and proposes solution algorithms that employ a variation of Monte Carlo tree search to plan non-myopically; results indicate that the non-myopic approach outperforms both passive and myopic strategies.
Abstract: Classifying objects in complex unknown environments is a challenging problem in robotics and is fundamental in many applications. Modern sensors and sophisticated perception algorithms extract rich 3D textured information, but are limited to the data that are collected from a given location or path. We are interested in closing the loop around perception and planning, in particular to plan paths for better perceptual data, and focus on the problem of planning scanning sequences to improve object classification from range data. We formulate a novel time-constrained active classification problem and propose solution algorithms that employ a variation of Monte Carlo tree search to plan non-myopically. Our algorithms use a particle filter combined with Gaussian process regression to estimate joint distributions of object class and pose. This estimator is used in planning to generate a probabilistic belief about the state of objects in a scene, and also to generate beliefs for predicted sensor observations from future viewpoints. These predictions consider occlusions arising from predicted object positions and shapes. We evaluate our algorithms in simulation, in comparison to passive and greedy strategies. We also describe similar experiments where the algorithms are implemented online, using a mobile ground robot in a farm environment. Results indicate that our non-myopic approach outperforms both passive and myopic strategies, and clearly show the benefit of active perception for outdoor object classification.
TL;DR: A novel framework that integrates a deep neural network based object recognition module and a deep reinforcement learning based action prediction mechanism is proposed, which outperforms competing methods in both average trajectory length and success rate.
Abstract: We study the problem of learning a navigation policy for a robot to actively search for an object of interest in an indoor environment solely from its visual inputs. While scene-driven visual navigation has been widely studied, prior efforts on learning navigation policies for robots to find objects are limited. The problem is often more challenging than target scene finding as the target objects can be very small in the view and can be in an arbitrary pose. We approach the problem from an active perceiver perspective, and propose a novel framework that integrates a deep neural network based object recognition module and a deep reinforcement learning based action prediction mechanism. To validate our method, we conduct experiments on both a simulation dataset (AI2-THOR) and a real-world environment with a physical robot. We further propose a new decaying reward function to learn the control policy specific to the object searching task. Experimental results validate the efficacy of our method, which outperforms competing methods in both average trajectory length and success rate.
TL;DR: Zhang et al. propose an end-to-end solution via deep reinforcement learning, in which a ConvNet-LSTM function approximator is adopted for direct frame-to-action prediction.
Abstract: We study active object tracking, where a tracker takes as input the visual observation (i.e., frame sequence) and produces the camera control signal (e.g., move forward, turn left, etc.). Conventional methods tackle the tracking and the camera control separately, which makes the resulting system challenging to tune jointly. They also require considerable human effort for labeling and much expensive trial and error in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning, where a ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training. The tracker trained in simulators (ViZDoom, Unreal Engine) shows good generalization in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. It can restore tracking after occasionally losing the target. With the experiments over the VOT dataset, we also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios.
TL;DR: In this article, an end-to-end solution via deep reinforcement learning is proposed, where a tracker takes visual observations (i.e., frame sequences) as input and produces corresponding camera control signals as output (e.g., move forward, turn left, etc.).
Abstract: We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as input and produces the corresponding camera control signals as output (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. These methods also require significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning. A ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function, which are crucial for successful training. The tracker trained in simulators (ViZDoom and Unreal Engine) demonstrates good generalization behaviors in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. The system is robust and can restore tracking after occasional loss of the target being tracked. We also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios. We demonstrate successful examples of such transfer, via experiments over the VOT dataset and the deployment of a real-world robot using the proposed active tracker trained in simulation.
TL;DR: A deep reinforcement learning architecture is developed to solve the active object detection problem as a sequential action-decision process, and a double deep Q-learning network is applied to predict an action at each step.
Abstract: Visual object detection is one of the fundamental tasks in computer vision and robotics. Small scale, partial capture, and occlusion often occur in robotic applications, and most existing object detection algorithms perform poorly in such situations. A robot, however, can look at an object from different views and plan its trajectory over the next few steps, which can lead to better observations. We formulate this as a sequential action-decision process, and develop a deep reinforcement learning architecture to solve the active object detection problem. A double deep Q-learning network (DQN) is applied to predict an action at each step. Experimental validation on the Active Vision Dataset shows the efficiency of the proposed method.
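As a rough illustration of the double Q-learning idea named in the abstract above, the following Python sketch computes double DQN training targets with NumPy: the online network's Q-values choose the next action, and a separate target network's Q-values evaluate it. The function name, array shapes, and discount factor are assumptions made for illustration; the abstract does not specify the network, the action set, or the reward.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Double Q-learning targets for a batch of B transitions with A actions.

    rewards:        shape (B,)
    dones:          shape (B,), 1.0 where the episode ended at this step
    q_next_online:  shape (B, A), online-network Q-values for the next state
    q_next_target:  shape (B, A), target-network Q-values for the next state
    gamma:          discount factor (an assumed value, not from the abstract)
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_next_online = np.asarray(q_next_online, dtype=float)
    q_next_target = np.asarray(q_next_target, dtype=float)

    # Action selection uses the online network ...
    best_actions = np.argmax(q_next_online, axis=1)
    # ... while action evaluation uses the target network.
    next_values = q_next_target[np.arange(len(best_actions)), best_actions]

    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values
```

Decoupling selection from evaluation is what distinguishes this target from the plain DQN target, which takes the maximum of the target network's own Q-values.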
TL;DR: This work proposes active object languages as a development tool for formal system models of distributed systems and uses the formal active object language ABS which comes with an extensive tool set to support rapid formalization of core ideas, early validity checks in terms of formal invariant proofs, and debugging support by executing test runs.
Abstract: We propose active object languages as a development tool for formal system models of distributed systems. In addition to a formalization based on a term rewriting system, we use established software engineering concepts, including software product lines and object orientation, that come with extensive tool support. We illustrate our modeling approach by prototyping a weak memory model. The resulting executable model is modular and has clear interfaces between communicating participants through object-oriented modeling. Relaxations of the basic memory model are expressed as self-contained variants of a software product line. As a modeling language we use the formal active object language ABS, which comes with an extensive tool set. This permits rapid formalization of core ideas, early validity checks in terms of formal invariant proofs, and debugging support by executing test runs. Hence, our approach supports the prototyping of formal system models with early feedback.
TL;DR: This work uses the Hough framework to fuse optical and orientation information of the different views of the objects, and applies active vision, where the direction of the next view is determined to increase the hit-rate of retrieval.
Abstract: In the last few years, there has been a steadily growing interest in autonomous vehicles and robotic systems. While many of these agents are expected to have limited resources, these systems should be able to dynamically interact with other objects in their environment. We present an approach where lightweight sensory and processing techniques, requiring very limited memory and processing power, can be successfully applied to the task of object retrieval using sensors of different modalities. We use the Hough framework to fuse optical and orientation information of the different views of the objects. In the presented spatio-temporal perception technique, we apply active vision, where, based on the analysis of initial measurements, the direction of the next view is determined to increase the hit-rate of retrieval. The performance of the proposed methods is shown on three datasets loaded with heavy noise.
TL;DR: An object detection method based on deep reinforcement learning is presented, belonging to the technical fields of pattern recognition and active object detection. The method first builds a deep reinforcement learning neural network and has a robot carry out multiple object detection tests to obtain training data and train the network; in the usage phase, the robot obtains the current image and the bounding box of the object to be detected in that image, and inputs them into the trained neural network.
Abstract: The invention provides an object detection method based on deep reinforcement learning, and belongs to the technical fields of pattern recognition and active object detection. The method comprises the following steps: first, building a deep reinforcement learning neural network; carrying out multiple object detection tests with a robot to obtain training data and train the neural network, thus obtaining a trained neural network; in the usage phase, the robot obtains the current image and the bounding box of the object to be detected in the image, and inputs them into the trained neural network; the network outputs the motion to be executed by the robot at the next moment; the robot executes the motion to obtain a new current bounding box, and uses a recognition function to make a determination; if the recognition confidence of the object to be detected in the bounding box is higher than a set recognition threshold, the object detection succeeds. The method uses reinforcement learning to control the robot's motions, and uses the resulting changes in the robot's viewing angle to obtain better observation images, thus obtaining a better object detection result.
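The usage phase described above amounts to an act-and-check loop. A minimal Python sketch follows, assuming three hypothetical callables (`policy`, `execute`, `recognize`), a hypothetical step limit `max_steps`, and an arbitrary default threshold; none of these names or values come from the patent text, and repeating the step until the threshold is met is an interpretation of it.

```python
from typing import Any, Callable, Tuple

# (image, box); both element types are placeholders for this sketch.
Observation = Tuple[Any, Any]


def active_detect(
    observation: Observation,
    policy: Callable[[Any, Any], Any],
    execute: Callable[[Any], Observation],
    recognize: Callable[[Any, Any], float],
    threshold: float = 0.9,
    max_steps: int = 20,
) -> Tuple[bool, int]:
    """Move the robot until recognition confidence exceeds the threshold.

    policy(image, box)    -> motion to execute (stands in for the trained network)
    execute(motion)       -> new (image, box) after the robot has moved
    recognize(image, box) -> confidence that the box contains the target object

    Returns (success, number of motions executed).
    """
    image, box = observation
    for step in range(1, max_steps + 1):
        motion = policy(image, box)       # the network proposes the next motion
        image, box = execute(motion)      # the robot moves and observes again
        if recognize(image, box) > threshold:
            return True, step             # detection succeeds
    return False, max_steps               # assumed give-up condition
```

The step limit is an addition for safety in the sketch; the source only states the success condition.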
TL;DR: An intrusion detection method on an active object-based storage system is presented. The method uses the auditing function of the storage system to monitor applications, obtains auditing log information, derives from the logs the system calls made by the applications, acquires provenance information from those system calls, and carries out parallel intrusion detection on the provenance information to obtain a detection result.
Abstract: The invention discloses an intrusion detection method on an active object-based storage system and belongs to the field of computer network security. The method comprises the steps of utilizing an auditing function of the active object-based storage system to monitor applications, obtaining auditing log information and acquiring information on the system calls made by the applications according to the auditing log information; acquiring provenance information of the applications according to the system call information; and carrying out parallel intrusion detection on the provenance information, thereby acquiring a detection result, wherein an abnormal detection result indicates that the active object-based storage system has been intruded, and a normal detection result indicates that the active object-based storage system is secure. According to the method, the provenance information is collected on the active object-based storage system, so the intrusion detection efficiency is high and the intrusion detection accuracy is relatively high.
TL;DR: An active vision object detection system is proposed on a robotic environment that uses a 3D camera mounted on the robot head and an RGB camera on its hand to detect and recognize objects being seen from the head camera, while computing a confidence score on the classification.
Abstract: Employing multiple sensing capabilities in a robotic platform offers significant advantages in increasing the recognition abilities of robots. Specifically, for vision-based object detection in a real-world environment, acquiring information from different viewpoints might be decisive for correct classifications in the presence of occlusions or to disambiguate between similar objects. For this reason, an active vision object detection system is proposed in this paper. It is implemented on a robotic environment that uses a 3D camera mounted on the robot head and an RGB camera on its hand. The system tries to detect and recognize objects being seen from the head camera, while computing a confidence score on the classification. In the case of an unreliable classification, another stage of object recognition is dynamically requested, but this time from the viewpoint of the hand camera. The objects detected from the two cameras are matched and their classification decisions are fused through a novel fusion approach based on the Dempster-Shafer evidence theory. Experimental results show sizable improvements in object recognition performance compared to a traditional single-camera configuration, as well as applicability to handling partial occlusions.
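The abstract describes the fusion step only as a novel approach based on Dempster-Shafer evidence theory, without giving details. As background, the Python sketch below implements the classical Dempster rule of combination for two mass functions; it is not the authors' fusion method, and the class labels and mass values in the usage lines are made up for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to its mass,
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal sets contribute to the conflict mass.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise so the combined masses sum to 1.
    return {h: w / norm for h, w in combined.items()}


# Hypothetical classification evidence from two viewpoints.
CUP, BOWL = frozenset({"cup"}), frozenset({"bowl"})
EITHER = CUP | BOWL  # mass on the whole set expresses ignorance
head = {CUP: 0.6, EITHER: 0.4}
hand = {CUP: 0.7, BOWL: 0.1, EITHER: 0.2}
fused = dempster_combine(head, hand)
print(fused[CUP])
```

Assigning mass to the full set lets each source express uncertainty explicitly, which is the usual reason this theory is chosen over a plain product of class probabilities.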
TL;DR: In this paper, the LARVA runtime verification tool is combined with the active object framework PROACTIVE for distributed monitoring of distributed, object-oriented applications, where monitoring mechanisms are automatically generated from property specifications, to check compliance at runtime.
Abstract: Since distributed software systems are ubiquitous, their correct functioning is crucially important. Static verification is possible in principle, but requires high expertise and effort which is not feasible in many eco-systems. Runtime verification can serve as a lean alternative, where monitoring mechanisms are automatically generated from property specifications, to check compliance at runtime. This paper contributes a practical solution for powerful and flexible runtime verification of distributed, object-oriented applications, via a combination of the runtime verification tool LARVA and the active object framework PROACTIVE. Even if LARVA supports in itself only the generation of local, sequential monitors, we empower LARVA for distributed monitoring by connecting monitors with active objects, turning them into active, communicating monitors. We discuss how this allows for a variety of monitoring architectures. Further, we show how property specifications, and thereby the generated monitors, provide a model that splits the blame between the local object and its environment. While LARVA itself focuses on monitoring of control-oriented properties, we use the LARVA front-end STARVOORS to also capture data-oriented (pre/post) properties in the distributed monitoring. We demonstrate this approach to distributed runtime verification with a case study, a distributed key/value store.
TL;DR: An active object-carrying assistant based on password verification, comprising a box (1) for accommodating sundries and a guiding structure (2) for a user to wear, is described in this article.
Abstract: An active object-carrying assistant based on password verification, comprising a box (1) for accommodating sundries and a guiding structure (2) for a user to wear. A driving device (11) for driving a movement is provided at a lower end of the box (1). Before using the active object-carrying assistant based on password verification, a password of the user is pre-stored, and when using the active object-carrying assistant, the user verifies the password by means of a verification module (21); if the verification succeeds, the starting of the driving device (11) can be controlled, and if the verification fails, the driving device is not started. After the driving device (11) is started, the user can put the sundries needing to be transported into the box (1) and then wear the guiding structure (2) on their body. When the user moves between rooms, the guiding structure (2) pulls a hauling rope (13) to control the direction of travel of the driving device (11), and the box (1) can follow behind the user and shift, along with the user, between rooms, so as to assist with object carrying.