1. What are the drawbacks of current detection technology?
The current state of detection technology has several drawbacks that prevent it from working with today's widely available infrastructure. Conventional surveillance systems depend on a watchful supervisor to review CCTV footage and make sure that any unusual activity is detected and addressed, and this manual review is prone to inaccuracies. The proposal aims to eliminate the need for extra guidance, reduce human input and labor, instantly recognize the type of crime taking place, note the people involved, and take immediate steps to start mitigation at the crime scene.
2. What models were used for detecting firearms in photographs?
To detect firearms in photographs, researchers trained classifier models using VGGNet 19 as the pre-trained base model. The resulting classifier achieved an accuracy of 69% and a recall of 75%.
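As a reminder of what the two reported metrics measure, here is a minimal sketch of how accuracy and recall are computed from confusion-matrix counts. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions (positive and negative) that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    """Fraction of actual positives (images that contain a firearm) that were detected."""
    return tp / (tp + fn)


# Hypothetical counts for illustration only (NOT data from the study):
# tp = firearm images flagged, fn = firearm images missed,
# fp = non-firearm images flagged, tn = non-firearm images passed.
tp, fp, tn, fn = 8, 2, 6, 4

print("accuracy:", accuracy(tp, fp, tn, fn))
print("recall:", recall(tp, fn))
```

Recall can be higher than accuracy, as in the reported results, when the classifier finds most firearm images but also raises a fair number of false alarms.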
3. How does the proposed method reduce manual intervention?
The proposed method reduces manual intervention by instantly recognizing the type of crime, identifying the people involved, and initiating an immediate response at the crime scene. It removes the need for an attentive supervisor to review CCTV footage, which reduces both labor and the potential for mistakes. The method uses deep learning techniques and a DNN module to train a facial recognition model, so that individuals in CCTV feeds can be identified efficiently. Automating this process makes the surveillance system more effective and efficient.
4. What is the process of face detection?
Face detection involves identifying a face within a picture or video and returning its position. It is the first step in face verification, where the presented face image is checked against a database to determine whether it matches any existing face. Distance metrics such as the L2 norm or cosine similarity are used to measure how similar two faces are. Face recognition builds on detection: it extracts salient facial features and assigns them to labels from the training dataset. In the pipeline described, face recognition consists of face detection, feature extraction, and training a Support Vector Machine (SVM) on the extracted embeddings. A Caffe model is used for face detection and an OpenFace model for feature extraction. The deep learning face detector is based on the Single Shot Detector (SSD) architecture and ResNet. SSD discretizes the image into a set of boxes over its feature maps, keeps the boxes with high confidence, and adjusts their sizes to better fit the detected face. The final bounding boxes are shown in Figure 3. Additionally, the dlib library is used for face alignment by identifying facial landmarks. The embedding network is trained with triplet loss, which fine-tunes the weights so that different faces produce distinct embeddings. A classifier such as a Random Forest, an SGD classifier, or an SVM can then be trained on top of the computed face embeddings.
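The distance metrics and the triplet loss mentioned above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the three-dimensional toy vectors stand in for real face embeddings, the squared-L2 form of the triplet loss is one common formulation, and the margin of 0.2 is an assumed value, none of which are confirmed by the source.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two embeddings (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss on squared L2 distances.

    The loss is zero once the anchor is closer to the positive (same person)
    than to the negative (different person) by at least `margin`.
    The margin value is an assumption for illustration.
    """
    d_pos = l2_distance(anchor, positive) ** 2
    d_neg = l2_distance(anchor, negative) ** 2
    return max(d_pos - d_neg + margin, 0.0)


# Toy 3-dimensional "embeddings" (hypothetical; real ones have many more dimensions).
anchor = [1.0, 0.0, 0.0]    # face of person A
positive = [0.9, 0.1, 0.0]  # another image of person A
negative = [0.0, 1.0, 0.0]  # face of person B

print("L2 anchor/positive:", l2_distance(anchor, positive))
print("L2 anchor/negative:", l2_distance(anchor, negative))
print("cosine anchor/negative:", cosine_similarity(anchor, negative))
print("triplet loss:", triplet_loss(anchor, positive, negative))
```

With well-separated embeddings like these, the loss is zero, which is the state that training tries to reach for every triplet. Swapping the positive and negative examples gives a large loss, which is what would drive a weight update.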