1. What are the visual features used in image retrieval?
Image retrieval draws on several families of visual features: shape-based features such as edges or moment invariants, color-based features such as a histogram of pixel values, and key-point-based features such as SIFT and SURF. More recent work uses convolutional neural networks (CNNs) for image matching and retrieval: the network is first trained for image classification and then modified to serve as a feature extractor. Together, these features describe shape, color distribution, and other visual properties of an image, which makes it possible to retrieve similar images from a database.
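As an illustration of the color-based case, the sketch below builds a normalized RGB histogram and compares two histograms by histogram intersection. It is a minimal example and not the method of the study: the function names, the bin count, the choice of histogram intersection as the similarity measure, and the toy pixel lists are all assumptions made for illustration.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized joint RGB histogram for an iterable of (r, g, b) pixels.

    Channel values are expected in 0..255. The histogram has
    bins_per_channel ** 3 entries and sums to 1.0 for a non-empty image.
    """
    bin_width = 256 / bins_per_channel
    last_bin = bins_per_channel - 1
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index (clamped to the last bin),
        # then flatten the three indices into a single position.
        ri = min(int(r / bin_width), last_bin)
        gi = min(int(g / bin_width), last_bin)
        bi = min(int(b / bin_width), last_bin)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images" as flat pixel lists (hypothetical data, not from the study).
red_image = [(250, 10, 10)] * 16
mostly_red_image = [(240, 20, 5)] * 12 + [(10, 10, 250)] * 4
blue_image = [(5, 5, 245)] * 16

query = color_histogram(red_image)
sim_mostly_red = histogram_intersection(query, color_histogram(mostly_red_image))
sim_blue = histogram_intersection(query, color_histogram(blue_image))
```

With these toy inputs the mostly red image scores higher against the red query than the blue image does, which is the behavior a color-based retrieval feature is meant to provide.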
2. What is the proposed method for extracting features from product images?
The proposed method trains CNN models with squared-hinge loss, rather than softmax loss, and uses the trained models for feature extraction. The extracted image features are then indexed with the nearest-neighbour (NN) indexing technique. The aim is to improve content-based product image retrieval by offering an alternative to existing CNN-based feature extraction methods. The study evaluates different CNN models, training parameters, and loss functions to find the best-performing configuration. The extracted features and the NN index can be applied to content-based retrieval in e-commerce shops.
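To make the difference between the two loss functions concrete, the sketch below computes both for a single example. It assumes one common multi-class form of squared-hinge loss, in which the true class has target +1, every other class has target -1, and the squared margins are averaged over classes; the study may use a different encoding. The function names and the example scores are hypothetical.

```python
import math


def squared_hinge_loss(scores, true_class):
    """Mean over classes of max(0, 1 - t * s) ** 2, with t = +1 for the
    true class and t = -1 for every other class (assumed encoding)."""
    targets = [1.0 if k == true_class else -1.0 for k in range(len(scores))]
    margins = [max(0.0, 1.0 - t * s) for t, s in zip(targets, scores)]
    return sum(m * m for m in margins) / len(scores)


def softmax_cross_entropy(scores, true_class):
    """Softmax loss for comparison: -log(softmax(scores)[true_class])."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[true_class]


# Hypothetical raw class scores for one image whose true class is 0.
scores = [2.0, -1.5, 0.5]
hinge = squared_hinge_loss(scores, true_class=0)
softmax = softmax_cross_entropy(scores, true_class=0)

# Once every class clears the margin, squared-hinge loss is exactly zero,
# whereas softmax loss stays positive for any finite scores.
hinge_confident = squared_hinge_loss([2.0, -2.0, -2.0], true_class=0)
softmax_confident = softmax_cross_entropy([2.0, -2.0, -2.0], true_class=0)
```

In the first example only the third class violates the margin, so it alone contributes to the squared-hinge value.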
3. What CNN models are used for feature extraction?
Prior work has extracted features from a range of CNN models, both for general image classification and retrieval and for specific applications such as product images. In [11] and [12], the FC6 and FC7 layers of the AlexNet model are used. Razavian et al. [14] applied the OverFeat model, extracting features from the first FC layer (layer 22). The HybridNet model [16] was used in [17], with features taken from the activations of the first FC layer (FC6). [18] used a self-built network model, while [19] applied CNN features from the VGG-19 model to fashion product image retrieval. Elleuch et al. [20] used features from the bottleneck layer of the Inception V3 model on a clothing dataset.
4. What are the three steps in the method of this study?
The method consists of three steps. First, transfer learning is applied to a pre-trained CNN model. Second, the model is trained on images and their category labels using squared-hinge loss. Third, image features are extracted from the fine-tuned model and indexed using the NN indexing technique. These steps yield a fine-tuned CNN model for content-based product image retrieval. Figure 2 illustrates the proposed method.
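The final indexing step can be sketched as a brute-force nearest-neighbour search over stored feature vectors. This is a minimal illustration under stated assumptions: the class name, the use of Euclidean distance, the product identifiers, and the three-dimensional vectors are hypothetical, and in practice the vectors would be the features extracted from the fine-tuned CNN.

```python
import math


class NearestNeighbourIndex:
    """Brute-force nearest-neighbour index over fixed-length feature vectors."""

    def __init__(self):
        self._ids = []
        self._vectors = []

    def add(self, item_id, vector):
        """Store one feature vector under the given product identifier."""
        self._ids.append(item_id)
        self._vectors.append(list(vector))

    def query(self, vector, k=1):
        """Return the k closest items as (item_id, distance), nearest first."""
        scored = [
            (math.dist(vector, stored), item_id)
            for item_id, stored in zip(self._ids, self._vectors)
        ]
        scored.sort()  # ascending Euclidean distance
        return [(item_id, distance) for distance, item_id in scored[:k]]


# Hypothetical feature vectors standing in for CNN-extracted features.
index = NearestNeighbourIndex()
index.add("shoe_01", [0.9, 0.1, 0.0])
index.add("shoe_02", [0.8, 0.2, 0.1])
index.add("bag_01", [0.1, 0.9, 0.3])

results = index.query([0.9, 0.1, 0.05], k=2)
result_ids = [item_id for item_id, _ in results]
```

A linear scan like this is only practical for small catalogues; larger collections would typically use a tree-based or approximate nearest-neighbour structure instead.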