TY - JOUR
T1 - Hierarchical visual localization for visually impaired people using multimodal images
AU - Cheng, Ruiqi
AU - Hu, Weijian
AU - Chen, Hao
AU - Fang, Yicheng
AU - Wang, Kaiwei
AU - Xu, Zhijie
AU - Bai, Jian
N1 - Funding Information:
This work has financial support from the ZJU-Sunny Photonics Innovation Center (No. 2020-03).
Publisher Copyright:
© 2020
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2021/3/1
Y1 - 2021/3/1
N2 - Localization is one of the crucial issues in assistive technology for visually impaired people. In this paper, we propose a novel hierarchical visual localization pipeline based on a wearable assistive navigation device for visually impaired people. The pipeline comprises a deep descriptor network, 2D-3D geometric verification and online sequence matching. Images in different modalities (RGB, infrared and depth) are fed into the Dual Desc network to generate robust attentive global descriptors and local features. The global descriptors are leveraged to retrieve coarse candidates for each query image. The 2D local features, together with the 3D sparse point cloud, are used in geometric verification to select the optimal result from the retrieved candidates. Finally, sequence matching robustifies the localization results by synthesizing the verified results of successive frames. The proposed unified descriptor network Dual Desc surpasses the state-of-the-art NetVLAD and its variant on the task of image description. Validated on a real-world dataset captured by the wearable assistive device, the proposed visual localization utilizes multimodal images to overcome the disadvantages of RGB images and robustifies localization performance through the deep descriptor network and the hierarchical pipeline. In the challenging scenarios of the Yuquan dataset, the proposed method achieves an F1 score of 0.77 and a mean localization error of 2.75, which is satisfactory for practical use.
KW - Deep Image Descriptor
KW - Place Recognition
KW - Geometric Validation
KW - Sequence Matching
KW - RGB-D-IR Image
UR - http://www.scopus.com/inward/record.url?scp=85089425044&partnerID=8YFLogxK
DO - 10.1016/j.eswa.2020.113743
M3 - Article
VL - 165
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
M1 - 113743
ER -