TL;DR: In this article, the authors propose a decoupled two-stage counting (D2C) framework that sequentially regresses the probability map and learns a counter conditioned on the probabilistic intermediate representation.
Abstract: One appealing approach to counting dense objects, such as crowds, is density map estimation. Density maps, however, present ambiguous appearance cues in congested scenes, making it infeasible to identify individuals and difficult to diagnose errors. Inspired by the observation that counting can be interpreted as a two-stage process, i.e., identifying possible object regions and counting exact object numbers, we introduce a probabilistic intermediate representation termed the probability map that depicts the probability of each pixel being an object. This representation allows us to decouple counting into probability map regression (PMR) and count map regression (CMR). We therefore propose a novel decoupled two-stage counting (D2C) framework that sequentially regresses the probability map and learns a counter conditioned on the probability map. Given the probability map and the count map, a peak point detection algorithm is derived to localize each object with a point under the guidance of local counts. An advantage of D2C is that the counter can be learned reliably with additional synthesized probability maps. This addresses the important data deficiency and sample imbalance problems in counting. Our framework also enables easy diagnosis and analysis of error patterns. For instance, we find that the counter per se is sufficiently accurate, while the bottleneck appears to be PMR. We further instantiate a network, D2CNet, in our framework and report state-of-the-art counting and localization performance across 6 crowd counting benchmarks. Since the probability map is a representation independent of visual appearance, D2CNet also exhibits remarkable cross-dataset transferability. Code and pretrained models are made available at: https://gitio/d2cnet
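The idea of localizing objects from a probability map under the guidance of local counts can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes, purely for illustration, that "guidance" means keeping the k strongest strict local maxima of the probability map, where k is the rounded local count. The function names `local_maxima` and `localize` and the toy map are hypothetical.

```python
def local_maxima(prob):
    """Return (value, row, col) for pixels strictly greater than all 8 neighbours."""
    h, w = len(prob), len(prob[0])
    peaks = []
    for r in range(h):
        for c in range(w):
            v = prob[r][c]
            neighbours = [
                prob[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
                if (rr, cc) != (r, c)
            ]
            if v > 0 and all(v > n for n in neighbours):
                peaks.append((v, r, c))
    return peaks


def localize(prob, local_count):
    """Assumed rule: keep the k strongest peaks, with k = rounded local count."""
    k = max(0, round(local_count))
    strongest = sorted(local_maxima(prob), reverse=True)[:k]
    return sorted((r, c) for _, r, c in strongest)


# Toy probability map: two confident peaks and one weak, spurious peak at (4, 0).
prob = [
    [0.1, 0.2, 0.1, 0.0, 0.0],
    [0.2, 0.9, 0.2, 0.1, 0.1],
    [0.1, 0.2, 0.1, 0.3, 0.1],
    [0.0, 0.1, 0.1, 0.1, 0.8],
    [0.3, 0.0, 0.0, 0.1, 0.2],
]

# A predicted local count of about 2 suppresses the weak third peak.
points = localize(prob, 2.1)
```

In this toy case the count acts as a cap on the number of detections, so a weak spurious peak is discarded when the counter reports roughly two objects in the region.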
TL;DR: A simple noncoherent detection method for chaos-shift-keying (CSK) modulation is proposed, exploiting some distinguishable property of chaotic maps for recovering the digital message.
Abstract: Chaos-based communications can be applied advantageously if the property of chaotic systems is suitably exploited. In this Letter a simple noncoherent detection method for chaos-shift-keying (CSK) modulation is proposed, exploiting some distinguishable property of chaotic maps for recovering the digital message. Specifically, the proposed method exploits the difference in the return maps of the signals representing the digital symbols. The determining parameter of the return maps is estimated using a simple regression algorithm. If the parameter strongly characterizes the chaotic map, the detection can achieve very good accuracy. This strong parametric characterization can be achieved by defining the regression model with only one parameter. Using the tent maps as chaos generators, the bit-error-rate under additive white Gaussian noise is studied by computer simulations.
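A minimal sketch of return-map regression for noncoherent CSK detection follows. It rests on several assumptions not stated in the abstract: bit 1 is sent by iterating the tent map x -> 1 - 2|x|, bit 0 by iterating its negation, and the receiver fits the one-parameter model r[n+1] = a * (1 - 2|r[n]|) by least squares, deciding the bit from the sign of a. All function names and the spreading factor of 32 samples per bit are hypothetical.

```python
import random


def tent(x):
    """Tent map on [-1, 1]."""
    return 1.0 - 2.0 * abs(x)


def transmit(bits, samples_per_bit=32, seed=0):
    """Assumed scheme: bit 1 iterates x -> tent(x), bit 0 iterates x -> -tent(x)."""
    rng = random.Random(seed)
    signal = []
    for bit in bits:
        sign = 1.0 if bit else -1.0
        x = rng.uniform(-1.0, 1.0)  # fresh initial condition for every symbol
        for _ in range(samples_per_bit):
            signal.append(x)
            x = sign * tent(x)
    return signal


def add_noise(signal, sigma, seed=1):
    """Additive white Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def estimate_parameters(received, samples_per_bit=32):
    """Least-squares estimate of a in r[n+1] = a * tent(r[n]), one value per symbol."""
    estimates = []
    for start in range(0, len(received), samples_per_bit):
        block = received[start:start + samples_per_bit]
        num = sum(block[n + 1] * tent(block[n]) for n in range(len(block) - 1))
        den = sum(tent(block[n]) ** 2 for n in range(len(block) - 1))
        estimates.append(num / den if den else 0.0)
    return estimates


def detect(received, samples_per_bit=32):
    """Decide each bit from the sign of the estimated return-map parameter."""
    return [1 if a > 0 else 0 for a in estimate_parameters(received, samples_per_bit)]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
clean = transmit(bits)
decoded = detect(add_noise(clean, sigma=0.05))
```

Because the model has a single parameter, the estimate reduces to a ratio of two sums, and the receiver needs no copy of the transmitted chaotic sequence.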
TL;DR: Li et al. propose a simple yet effective Learning to Scale (L2S) module that dynamically separates overlapped blobs and decomposes the accumulated density values in the ground-truth density map, which alleviates the long-tailed distribution of density values and helps the model learn the density map.
Abstract: Recent works on crowd counting mainly leverage Convolutional Neural Networks (CNNs) to count by regressing density maps, and have achieved great progress. In the density map, each person is represented by a Gaussian blob, and the final count is obtained by integrating over the whole map. However, it is difficult to accurately predict the density map in dense regions. A major issue is that the density map in dense regions usually accumulates density values from a number of nearby Gaussian blobs, yielding different large density values on a small set of pixels. This gives the density map a long-tailed distribution of pixel-wise density values. In this paper, we aim to address this long-tailed distribution issue in the density map. Specifically, we propose a simple yet effective Learning to Scale (L2S) module, which automatically scales dense regions into reasonable density levels. It dynamically separates the overlapped blobs, decomposes the accumulated values in the ground-truth density map, and thus alleviates the long-tailed distribution of density values, which helps the model to better learn the density map. We also explore the effectiveness of L2S in localizing people by finding the local minima of the quantized distance (w.r.t. the person location map), which suffers from an issue similar to that of density map regression. To the best of our knowledge, such a localization method is also novel in localization-based crowd counting. We further introduce a customized dynamic cross-entropy loss, which significantly improves optimization of the localization-based model. Extensive experiments demonstrate that the proposed framework, termed AutoScale, improves upon some state-of-the-art methods in both regression and localization benchmarks on three crowded datasets and achieves very competitive performance on two sparse datasets.
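The accumulation effect described above can be reproduced with a small sketch of a ground-truth density map. It assumes a fixed Gaussian width (`sigma = 1.5`) and normalizes each blob over the image so that it integrates to one; the function names and the toy head positions are hypothetical and not taken from the paper.

```python
import math


def density_map(points, height, width, sigma=1.5):
    """Sum one unit-mass Gaussian blob per annotated (row, col) point."""
    dmap = [[0.0] * width for _ in range(height)]
    for py, px in points:
        blob = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, blob))  # normalize so each person contributes 1
        for y in range(height):
            for x in range(width):
                dmap[y][x] += blob[y][x] / total
    return dmap


def count(dmap):
    """The count is the integral (here, the sum) of the density map."""
    return sum(map(sum, dmap))


# One isolated person at (5, 5) and a tight cluster of three around (20, 20).
heads = [(5, 5), (20, 20), (20, 21), (21, 20)]
dmap = density_map(heads, height=32, width=32)

isolated_peak = dmap[5][5]
cluster_peak = dmap[20][20]
```

Here `count(dmap)` recovers the four annotated people, while `cluster_peak` is more than twice `isolated_peak` because neighbouring blobs overlap. In a real crowd image such high values occur on only a few pixels, which is the long-tailed distribution the abstract refers to.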
TL;DR: In this paper, a novel end-to-end crowd counting framework via multi-level regression with latent Gaussian maps is proposed, which consists of GaussianNet, EstimateNet and Discriminator.
Abstract: Crowd counting still confronts two primary challenges: limited ability to deal with varying density levels, caused by fixed density maps, and a lack of fine-grained or coarse-grained guidance for density estimation. In this paper, a novel end-to-end crowd counting framework via multi-level regression with latent Gaussian maps is proposed, which consists of GaussianNet, EstimateNet and Discriminator. GaussianNet is composed of masked Gaussian convolutional blocks and vanilla convolutional layers, and generates latent Gaussian maps adaptively for various density levels. The latent Gaussian maps are then treated as the ground-truth density maps for EstimateNet, which outputs density estimates and is trained adversarially against Discriminator. Moreover, multi-level losses are combined to guide density map regression. In extensive experiments on the major public datasets, the proposed framework outperforms state-of-the-art methods, demonstrating its effectiveness.
TL;DR: This invention provides a method, system, electronic terminal, and storage medium for crowd counting and positioning in RGB-D images, in which a target detection network and a density map regression network are trained on depth-related reference frame size data.
Abstract: The invention provides a crowd counting and positioning method and system, an electronic terminal, and a storage medium. The method comprises the following steps: obtaining depth-related reference frame size data for each sub-image in an image to be analyzed; training a target detection network and a density map regression network based on the depth-related reference frame size data; and allocating each depth-varying density map output by the density map regression network to the network layers of the target detection network, where it is spliced with the features of the corresponding layer. The spliced network is used to perform crowd counting and positioning on the image to be analyzed. Crowd counting is carried out based on RGB-D data and a target detection algorithm, and the head position of each person can be located. With the technical scheme provided by the invention, reference frames can be annotated quickly using depth information, depth-related reference frames can be designed, and the regressed density maps are distributed to different layers of the target detection network as attention maps to improve counting accuracy and positioning precision.