This paper presents a framework for photovoltaic (PV) fault detection based on statistical, supervised, and unsupervised machine learning (ML) approaches. The research is motivated by the need for a cost-effective solution that detects fault types within PV systems from a real dataset using a minimum number of input features. We identify the conditions under which each method is appropriate and establish how to minimize the computational demand of the different ML approaches. The PV dataset is then labeled through clustering and classification. Various ML models are trained on the labeled dataset, and each is evaluated based on accuracy, precision, and a confusion matrix. Notably, an accuracy ranging from 94% to 100% is achieved with datasets from two different PV systems. Model robustness is affirmed by applying the approach to an additional real-world dataset that exhibits noise and missing values.