Statistical Analysis and Development of an Ensemble-Based Machine Learning Model for Photovoltaic Fault Detection

Muhammad Hussain, Hussain Al-Aqrabi, Richard Hill

Research output: Contribution to journal › Article › peer-review


This paper presents a framework for photovoltaic (PV) fault detection based on statistical, supervised, and unsupervised machine learning (ML) approaches. The research is motivated by the need for a cost-effective solution that detects fault types within PV systems from a real dataset using a minimum number of input features. We identify the conditions under which each method is appropriate and establish how to minimize the computational demand of the different ML approaches. The PV dataset is then labeled using the results of clustering and classification. Various ML models are trained on the labeled dataset, and each is evaluated on accuracy, precision, and its confusion matrix. Notably, an accuracy ranging from 94% to 100% is achieved with datasets from two different PV systems. Model robustness is confirmed by applying the approach to an additional real-world dataset that exhibits noise and missing values.
Original language: English
Article number: 5492
Number of pages: 14
Issue number: 15
Early online date: 29 Jul 2022
Publication status: Published - 1 Aug 2022
