Abstract
The rapid adoption of Internet of Things (IoT) technologies has led to the generation of large amounts of data, usually from sensor devices. Massive datasets of this kind commonly contain many missing values. This is a challenge for data miners because many data analysis methods only work well on complete databases. A popular way to deal with this challenge is to fill in (impute) missing values using suitable estimation techniques. Unfortunately, many existing methods rely on all the observed values in the entire dataset to estimate each missing value, which lowers imputation accuracy and raises computational complexity. In this paper, we propose a novel imputation technique based on data clustering and a robust selection of suitable imputation equations for each missing datapoint. We evaluate the proposed method on six University of California Irvine (UCI) datasets and compare it with five recently proposed imputation methods. The results show that the proposed method is comparable to the Local Similarity Imputation (LSI) technique in terms of imputation accuracy, but is significantly less complex than all the existing methods considered.
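To make the general idea of cluster-based imputation concrete, here is a minimal sketch in Python. It is not the method proposed in the paper: the abstract does not specify the clustering algorithm or the imputation equations, so this sketch assumes a plain k-means-style loop and fills each gap with the mean of that feature within the row's cluster. The function names (`cluster_impute`, `partial_distance`, `column_means`), the parameters `k` and `n_iter`, and the toy data are all hypothetical.

```python
import math


def partial_distance(row, centroid):
    """Root-mean-square difference over the features observed in `row`.

    Missing values are represented by None and are skipped, so rows with
    different numbers of observed features remain comparable.
    """
    pairs = [(x, c) for x, c in zip(row, centroid) if x is not None]
    if not pairs:
        return 0.0
    return math.sqrt(sum((x - c) ** 2 for x, c in pairs) / len(pairs))


def column_means(rows, n_features, fallback):
    """Per-feature mean over observed values; fallback[j] if none are observed."""
    means = []
    for j in range(n_features):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else fallback[j])
    return means


def cluster_impute(data, k=2, n_iter=10):
    """Illustrative cluster-mean imputation (an assumption, not the paper's method)."""
    n_features = len(data[0])
    global_means = column_means(data, n_features, [0.0] * n_features)

    # Deterministic initialisation (assumption): the first k rows,
    # with any gaps filled by the global feature means.
    centroids = [
        [x if x is not None else global_means[j] for j, x in enumerate(data[i])]
        for i in range(k)
    ]

    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assign each row to the nearest centroid using observed features only.
        labels = [
            min(range(k), key=lambda c: partial_distance(row, centroids[c]))
            for row in data
        ]
        # Recompute centroids from observed values; keep the old centroid
        # coordinate when a cluster has no observed value for a feature.
        centroids = [
            column_means(
                [r for r, lab in zip(data, labels) if lab == c],
                n_features,
                centroids[c],
            )
            for c in range(k)
        ]

    # Fill each gap from the centroid of the row's own cluster only,
    # rather than from statistics of the entire dataset.
    filled = [
        [x if x is not None else centroids[lab][j] for j, x in enumerate(row)]
        for row, lab in zip(data, labels)
    ]
    return filled, labels


if __name__ == "__main__":
    toy = [
        [1.0, 2.0],
        [10.0, 20.0],
        [1.2, None],
        [0.8, 1.8],
        [None, 22.0],
        [11.0, 21.0],
    ]
    filled, labels = cluster_impute(toy, k=2)
    for row, lab in zip(filled, labels):
        print(lab, row)
```

Restricting each estimate to one cluster is what keeps the cost per missing value low in a sketch like this: only the rows in that cluster contribute, not the whole dataset.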
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th International Conference on Internet of Things, Big Data and Security |
Subtitle of host publication | IoTBDS 2020 |
Editors | Gary Wills, Péter Kacsuk, Victor Chang |
Publisher | SciTePress |
Pages | 130-137 |
Number of pages | 8 |
Volume | 1 |
ISBN (Electronic) | 9789897584268 |
Publication status | Published - 7 May 2020 |
Event | 5th International Conference on Internet of Things, Big Data and Security - Online Streaming, Virtual, Online; Duration: 7 May 2020 → 9 May 2020; Conference number: 5; http://www.iotbds.org/Home.aspx?y=2020 |
Conference
Conference | 5th International Conference on Internet of Things, Big Data and Security |
---|---|
Abbreviated title | IoTBDS2020 |
City | Virtual, Online |
Period | 7/05/20 → 9/05/20 |
Other | Originally planned for Prague, but held online due to COVID-19 |
Internet address | http://www.iotbds.org/Home.aspx?y=2020 |