Crowd counting in images is a challenging problem. Many neural-network-based methods use two-branch or multi-branch networks to extract high-level features at different scales or densities, and then merge these features with a fusion operation. Although these methods reduce the crowd-counting error, they greatly increase the number of parameters, so the model is inefficient to train and optimize and consumes substantial computing resources. To this end, a residual network based on depthwise separable convolution is proposed for image crowd counting. The network not only reduces the amount of computation through depthwise separable convolution, but also deepens the network through the residual structure to extract more effective high-level features. Experiments show that, compared with state-of-the-art methods, the proposed method dramatically reduces the number of parameters to 1.91 million while achieving comparable accuracy.
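To illustrate why depthwise separable convolution lowers the parameter count, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1 x 1 pointwise convolution. The function names and the layer sizes (256 input channels, 256 output channels, 3 x 3 kernel) are assumptions chosen for illustration and are not taken from the paper; bias terms are omitted.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a k x k standard convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution plus a 1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration only; not the paper's configuration.
c_in, c_out, k = 256, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(std, sep, round(sep / std, 3))  # 589824 67840 0.115
```

For this assumed layer, the separable form needs roughly 11.5% of the weights of the standard form; in general the ratio is 1/c_out + 1/k^2.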
|Number of pages||7|
|Journal||Journal of Physics: Conference Series|
|Early online date||13 Oct 2020|
|Publication status||Published - 13 Oct 2020|
|Event||3rd International Conference on Computer Information Science and Application Technology - Dali, China|
Duration: 17 Jul 2020 → 19 Jul 2020
Conference number: 3