Proceedings Article, DOI: 10.1109/HSI.2018.8431232
A Vision and Speech Enabled, Customizable, Virtual Assistant for Smart Environments
Giancarlo Iannizzotto, Lucia Lo Bello, Andrea Nucita, Giorgio Mario Grasso
- 04 Jul 2018
- pp 50-56
TL;DR: Some of the most advanced techniques in computer vision, deep learning, speech generation and recognition, and artificial intelligence are combined into a virtual assistant architecture for smart home automation systems, which is effective and resource-efficient, interactive and customizable.
Abstract: Recent developments in smart assistants and smart home automation are attracting the interest and curiosity of consumers and researchers. Speech-enabled virtual assistants (often named smart speakers) offer a wide variety of network-oriented services and, in some cases, can connect to smart environments, thus enhancing them with new and effective user interfaces. However, such devices also reveal new needs and some weaknesses. In particular, they represent faceless and blind assistants, unable to show a face, and therefore an emotion, and unable to ‘see’ the user. As a consequence, the interaction is impaired and, in some cases, ineffective. Moreover, most of those devices heavily rely on cloud-based services, thus transmitting potentially sensitive data to remote servers. To overcome such issues, in this paper we combine some of the most advanced techniques in computer vision, deep learning, speech generation and recognition, and artificial intelligence, into a virtual assistant architecture for smart home automation systems. The proposed assistant is effective and resource-efficient, interactive and customizable, and the realized prototype runs on a low-cost, small-sized Raspberry Pi 3 device. For testing purposes, the system was integrated with an open-source home automation environment and ran for several days, while people were encouraged to interact with it, and proved to be accurate, reliable and appealing.
Citations
Deep Learning Application Pros And Cons Over Algorithm
TL;DR: This review provides a general overview of a new concept and the growing benefits and popularity of deep learning, which can help researchers and students interested in deep learning methods.
Voice Assistant Integrated with Chat GPT
TL;DR: Farcana combines the functionality of the GPT chatbot and a voice assistant and offers players a new approach to familiarize themselves with game mechanics and general account management.
Effects of Patient Care Assistant Embodiment and Computer Mediation on User Experience
Kangsoo Kim, Nahal Norouzi, Tiffany Losekamp, Gerd Bruder, Mindi Anderson, Gregory F. Welch
- 01 Dec 2019
TL;DR: The results show that, as expected, a real caregiver provides the optimal user experience but an embodied virtual assistant is also a viable option for patient care environments, providing significantly higher social presence and engagement than voice-only interaction.
Priority-Based Bandwidth Management in Virtualized Software-Defined Networks
TL;DR: This work presents the PrioSDN Resource Manager (PrioSDN_RM), a resource management mechanism based on admission control for virtualized SDN-based networks that exploits a priority-based runtime bandwidth distribution mechanism to dynamically react to load changes (e.g., due to alarms).
A Perspective on Ethernet in Automotive Communications—Current Status and Future Trends
TL;DR: This article provides an overview of Ethernet-based in-car networking and discusses novel trends and future developments in automotive communications.
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; it won 1st place on the ILSVRC 2015 classification task.
Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Robust Real-Time Face Detection
Paul A. Viola, Michael Jones
TL;DR: This paper describes a face detection framework that is capable of processing images extremely rapidly while achieving high detection rates; the reported detector runs at 15 frames per second.
FaceNet: A Unified Embedding for Face Recognition and Clustering
TL;DR: FaceNet uses a deep convolutional network trained to directly optimize the embedding itself, rather than an intermediate bottleneck layer as in previous deep learning approaches, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
Robust real-time face detection
Paul A. Viola, Michael Jones
- 07 Jul 2001
TL;DR: A new image representation called the “Integral Image” is introduced which allows the features used by the detector to be computed very quickly and a method for combining classifiers in a “cascade” which allows background regions of the image to be quickly discarded while spending more computation on promising face-like regions.
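To make the "Integral Image" idea in the summary above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the detector from the referenced paper or the prototype described in the abstract: the function names (`integral_image`, `rect_sum`, `two_rect_feature`) are hypothetical, images are assumed to be lists of rows of numbers, and no cascade or classifier training is shown. The point is that once the summed-area table is built, the sum over any rectangle costs four lookups, which is what makes rectangle-based features cheap to evaluate.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table with a zero top row and left column.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Hypothetical two-rectangle feature: left half minus right half.

    Assumes an even width; any odd trailing column is ignored.
    """
    half = width // 2
    left_part = rect_sum(ii, top, left, height, half)
    right_part = rect_sum(ii, top, left + half, height, half)
    return left_part - right_part


if __name__ == "__main__":
    image = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    table = integral_image(image)
    print(rect_sum(table, 0, 0, 3, 3))          # whole image
    print(rect_sum(table, 1, 1, 2, 2))          # bottom-right 2x2 block
    print(two_rect_feature(table, 0, 0, 3, 2))  # column 0 minus column 1
```

The table is built in a single pass over the image, after which each feature evaluation is independent of the rectangle's size. A cascade, as described in the summary, would apply cheap features like this first and discard most background windows before running more expensive checks.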