About: Vision processing unit is a research topic. Over the lifetime, 28 publications have been published within this topic receiving 188 citations. The topic is also known as: VPU.
TL;DR: The vision processing unit incorporates parallelism, instruction set architecture, and microarchitectural features to provide highly sustainable performance efficiency across a range of computational imaging and computer vision applications, including those with low latency requirements on the order of milliseconds.
Abstract: Myriad 2 is a multicore, always-on system on chip that supports computational imaging and visual awareness for mobile, wearable, and embedded applications. The vision processing unit incorporates parallelism, instruction set architecture, and microarchitectural features to provide highly sustainable performance efficiency across a range of computational imaging and computer vision applications, including those with low latency requirements on the order of milliseconds.
TL;DR: This study analyzes and implements techniques that combine an additional hardware element, a Vision Processing Unit (VPU), with methods that affect the resolution, bit rate, and processing time of video.
Abstract: The current number of surveillance cameras and the variety of methods for counting vehicles raise the question: where is the best place to process video streams? This work implements a counting system for mobility actors such as cars, pedestrians, motorcycles, bicycles, buses, and trucks in the context of an edge computing application using deep learning. However, running deep neural networks for object detection on low-capacity embedded devices makes it difficult to perform tasks that require high processing power or must be carried out in real time. To solve this problem, this study analyzes and implements techniques that combine an additional hardware element, a Vision Processing Unit (VPU), with methods that affect the resolution, bit rate, and processing time of video. For this purpose we consider the MobileNet-SSD model with two approaches: a model pre-trained on known data sets and a model trained with images from our specific scenarios. The MobileNet-SSD model yields different results in terms of accuracy and video processing time depending on the approach. Results show that an embedded device combined with a VPU and video processing techniques reaches 18.62 frames per second (FPS). Thus, the processing time (5.63 minutes) is slightly longer than the duration of a 5-minute video. Recall and precision values of 91% and 97% are reported in the best case (class car) for the vehicle counting system.
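The throughput figures in this abstract can be related with simple arithmetic. The sketch below is illustrative only and is not taken from the paper: it assumes processing time equals total frames divided by processing FPS, and from that derives the source frame rate implied by the reported numbers (the abstract does not state it). The function names and the detection counts passed to `precision_recall` are hypothetical, chosen only to show how values near 97% and 91% would arise.

```python
def implied_source_fps(video_minutes, processing_minutes, processing_fps):
    """Source frame rate implied by the reported timings.

    Assumes every source frame is processed exactly once (an assumption,
    not something the abstract states).
    """
    return processing_minutes * processing_fps / video_minutes


def processing_minutes(video_minutes, source_fps, processing_fps):
    """Minutes needed to process a video at a given processing rate."""
    total_frames = video_minutes * 60 * source_fps
    return total_frames / processing_fps / 60


def precision_recall(tp, fp, fn):
    """Standard precision and recall from detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Reported values: 5-minute video, 5.63 minutes of processing, 18.62 FPS.
source_fps = implied_source_fps(5, 5.63, 18.62)  # roughly 21 frames per second

# Hypothetical counts (NOT from the paper) that give about 97% / 91%.
precision, recall = precision_recall(tp=91, fp=3, fn=9)

print(f"implied source frame rate: {source_fps:.2f} FPS")
print(f"precision: {precision:.2%}, recall: {recall:.2%}")
```

Under this assumption, the system falls just short of real time because the processing rate (18.62 FPS) is slightly below the implied source rate.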
TL;DR: This work considers the integration of co-processors in high-performance computing (HPC) to enable low-power, seamless computation offloading of certain operations, and explores the so-called Vision Processing Unit (VPU), a highly-parallel vector processor with a power envelope of less than 1W.
Abstract: It is widely argued that the success of exascale supercomputers remains dependent on novel technological breakthroughs that effectively reduce power consumption and thermal dissipation requirements. In this work, we consider the integration of co-processors in high-performance computing (HPC) to enable low-power, seamless computation offloading of certain operations. In particular, we explore the so-called Vision Processing Unit (VPU), a highly parallel vector processor with a power envelope of less than 1W. We evaluate this chip during inference using a pre-trained GoogLeNet convolutional network model and a large image dataset from the ImageNet ILSVRC challenge. Preliminary results indicate that a multi-VPU configuration provides performance similar to reference CPU and GPU implementations, while reducing the thermal design power (TDP) by up to 8x in comparison.
TL;DR: A system and a method for detecting a face are proposed. The system includes a vision processing unit and a face detection unit; the vision processing unit calculates distance information from a plurality of images that include a face pattern, and discriminates between a foreground image that includes the face pattern and a background image that does not.
Abstract: A system and a method for detecting a face are provided. The system includes a vision processing unit and a face detection unit. The vision processing unit calculates distance information using a plurality of images including a face pattern, and discriminates between a foreground image including the face pattern and a background image not including the face pattern, using the distance information. The face detection unit scales the foreground image according to the distance information, and detects the face pattern from the scaled foreground image.
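The two stages described above, foreground discrimination by distance and distance-based scaling, can be sketched as follows. This is a minimal illustration and not the method claimed in the abstract: it assumes a per-pixel depth map is already available, uses a simple distance threshold to separate foreground from background, and assumes a pinhole-camera model in which apparent size is inversely proportional to distance. All names (`foreground_mask`, `scale_factor`, `reference_distance`) are hypothetical.

```python
def foreground_mask(depth_map, max_distance):
    """Mark pixels at or closer than max_distance as foreground (True)."""
    return [[d <= max_distance for d in row] for row in depth_map]


def mean_foreground_distance(depth_map, mask):
    """Average distance over foreground pixels, or None if there are none."""
    values = [
        d
        for depth_row, mask_row in zip(depth_map, mask)
        for d, is_fg in zip(depth_row, mask_row)
        if is_fg
    ]
    if not values:
        return None
    return sum(values) / len(values)


def scale_factor(distance, reference_distance):
    """Factor that resizes the foreground to its size at reference_distance.

    Assumes apparent size is proportional to 1 / distance (pinhole model).
    """
    return distance / reference_distance


# Toy depth map in meters: a near object (2 m) against a far wall (4 m).
depth = [
    [4.0, 4.0, 4.0, 4.0],
    [4.0, 2.0, 2.0, 4.0],
    [4.0, 2.0, 2.0, 4.0],
]

mask = foreground_mask(depth, max_distance=3.0)
distance = mean_foreground_distance(depth, mask)
factor = scale_factor(distance, reference_distance=1.0)

print(mask[1], distance, factor)
```

Normalizing the foreground to a single reference size in this way would let a detector search at one scale instead of scanning an image pyramid, which is one plausible motivation for scaling by distance.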
TL;DR: This paper focuses on edge deployments that make the smart queuing system (SQS) accessible to all and able to run on cheap devices, thus considerably reducing the cost of deploying such a system.
Abstract: Recent increases in computational power and the development of specialized architectures have made it possible to perform machine learning, especially inference, on the edge. OpenVINO is a toolkit based on convolutional neural networks that facilitates fast-track development of computer vision algorithms and deep learning neural networks into vision applications, and enables their easy heterogeneous execution across hardware platforms. Smart queue management can be the key to the success of any sector. In this paper, we focus on edge deployments to make the smart queuing system (SQS) accessible to all, while also allowing it to run on cheap devices. This allows the queuing system's deep learning algorithms to run on pre-existing computers that a retail store, public transportation facility, or factory may already possess, thus considerably reducing the cost of deploying such a system. SQS demonstrates how to create a video AI solution on the edge. We validate our results by testing the system on multiple edge devices, namely a CPU, an integrated edge graphics processing unit (iGPU), a vision processing unit (VPU), and field-programmable gate arrays (FPGAs). Experimental results show that deploying an SQS on the edge is very promising.