Abstract
Learning is the heart of intelligence. Machine learning focuses on automating methods that achieve objectives, improve predictions, or encourage informed behavior. Feature selection is a vital step in data analysis that often reduces dataset dimensionality by eliminating irrelevant and/or redundant attributes, either to simplify the learning process or to improve the quality of its outcomes. This research critically analyses different filter methods based on ranking procedures (Information Gain (IG), Chi-square (CHI), V-score, Fisher Score, mRMR, Va and ReliefF) and identifies the challenges that arise. We concentrate in particular on how threshold determination can affect the results of filter methods based on ranked scores. We show that this issue is vital, especially in the era of big data, in which users deal with attributes on the order of tens of thousands but only a limited number of instances.
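The threshold issue raised in the abstract can be illustrated with a minimal sketch. It assumes the Fisher score as the ranking criterion and uses a made-up toy dataset; the function names (`fisher_scores`, `select_top_k`, `select_above`) and the cutoff values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter over within-class scatter."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall_mean) ** 2
            within += len(values) * pvariance(values)
        # Guard against a constant feature within every class.
        scores.append(between / within if within > 0 else 0.0)
    return scores


def rank_features(scores):
    """Feature indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def select_top_k(scores, k):
    """Threshold by count: keep the k best-ranked features."""
    return rank_features(scores)[:k]


def select_above(scores, cutoff):
    """Threshold by value: keep features whose score reaches the cutoff."""
    return [j for j in rank_features(scores) if scores[j] >= cutoff]


# Toy data, invented for illustration: 6 instances, 3 features, 2 classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.9],
    [0.9, 4.0, 0.4],
    [3.0, 4.5, 0.8],
    [3.1, 3.5, 0.3],
    [2.9, 4.2, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

scores = fisher_scores(X, y)
print("ranking:", rank_features(scores))
print("top-1:", select_top_k(scores, 1))
print("top-2:", select_top_k(scores, 2))
print("score >= 1.0:", select_above(scores, 1.0))
print("score >= 0.01:", select_above(scores, 0.01))
```

The ranking itself is the same in every call; only the threshold changes which features survive. A count-based threshold and a value-based threshold can agree or disagree depending on how the scores are distributed, which is why the choice has to be justified rather than fixed by convention.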
Original language | English |
---|---|
Title of host publication | 2019 International Conference on Computer and Information Sciences, ICCIS 2019 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 1-4 |
Number of pages | 4 |
ISBN (Electronic) | 9781538681251 |
ISBN (Print) | 9781538681268 |
Publication status | Published - 16 May 2019 |
Event | 2019 International Conference on Computer and Information Sciences - Sakaka, Saudi Arabia. Duration: 3 Apr 2019 → 4 Apr 2019. http://www.wikicfp.com/cfp/servlet/event.showcfp?eventid=81624&copyownerid=130722 https://www.ju.edu.sa/en/iccis-2019/home/ |
Conference
Conference | 2019 International Conference on Computer and Information Sciences |
---|---|
Abbreviated title | ICCIS 2019 |
Country/Territory | Saudi Arabia |
City | Sakaka |
Period | 3/04/19 → 4/04/19 |