Abstract
Although convolutional neural networks (CNNs) have become the mainstream approach to image processing and have achieved great success over the past decade, their local receptive fields make it difficult for them to capture global, long-range semantic information. Moreover, in some scenes a model based purely on RGB images struggles to classify pixels accurately and to segment object edges finely. This study presents a hierarchical vision Transformer model named Swin-RGB-D, which incorporates and exploits the depth information in depth images to supplement and enhance ambiguous and obscure features in RGB images. In this design, RGB and depth images are the two inputs of a two-branch network. The upstream branch applies the Swin Transformer, which is capable of learning global continuous information from RGB images for segmentation; the other branch performs channel attention on the depth image to extract the feature correlations and dependencies between channels and generates a weight matrix. This weight matrix is then multiplied with the feature maps at each stage of the down-sampling process to extract weighted multi-modal features. The fused maps are then added to the up-sampled feature maps of the corresponding size, which compensates for the distortion of features during sampling. Experimental results on two benchmark datasets show that the proposed model makes the network more sensitive to edge information.
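The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact form of the channel attention or of the multiplication, so the code below assumes a squeeze-and-excitation style gate (global average pooling, two small fully connected layers, sigmoid) and channel-wise scaling of the RGB features. All function names, layer weights, and tensor shapes are hypothetical and are not taken from the paper. Plain Python lists are used so the sketch has no dependencies.

```python
import math

def global_avg_pool(feat):
    """feat: list of C channels, each an H x W nested list. Returns C channel means."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]

def dense(vec, weights, bias):
    """Small fully connected layer: y = W v + b."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def channel_weights(depth_feat, w1, b1, w2, b2):
    """Assumed SE-style channel attention: pool -> FC -> ReLU -> FC -> sigmoid."""
    squeezed = global_avg_pool(depth_feat)
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]

def reweight(rgb_feat, weights):
    """Scale each RGB channel by the weight derived from the depth branch."""
    return [[[w * x for x in row] for row in ch] for ch, w in zip(rgb_feat, weights)]

def add_maps(a, b):
    """Element-wise sum of two feature maps of identical shape (skip connection)."""
    return [[[x + y for x, y in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]

# Toy inputs: 2 channels, 2 x 2 spatial size (illustrative values only).
rgb_feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
depth_feat = [[[0.0, 0.0], [0.0, 0.0]], [[2.0, 2.0], [2.0, 2.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical FC weights
zeros = [0.0, 0.0]                    # hypothetical FC biases

weights = channel_weights(depth_feat, identity, zeros, identity, zeros)
fused = reweight(rgb_feat, weights)   # depth-weighted encoder features
upsampled = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]  # stand-in decoder map
out = add_maps(fused, upsampled)      # fused map added to the up-sampled map
```

In a real two-branch network this would be repeated at every encoder stage, with the weights computed from the depth features at the matching resolution; the sketch shows a single stage only.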
Original language | English |
---|---|
Title of host publication | SSPS 2022 |
Subtitle of host publication | Proceedings of the 4th International Symposium on Signal Processing Systems |
Publisher | Association for Computing Machinery (ACM) |
Pages | 68-73 |
Number of pages | 6 |
ISBN (Electronic) | 9781450396103 |
Publication status | Published - 25 Mar 2022 |
Event | 4th International Symposium on Signal Processing Systems - Virtual, Online, China; Duration: 25 Mar 2022 → 27 Mar 2022; Conference number: 4 |
Publication series
Name | ACM International Conference Proceeding Series |
---|---|
Volume | Par F180473 |
Conference
Conference | 4th International Symposium on Signal Processing Systems |
---|---|
Abbreviated title | SSPS 2022 |
Country/Territory | China |
City | Virtual, Online |
Period | 25/03/22 → 27/03/22 |