SME Sustainability Prediction Using Composite Index-Based Pseudo-Labeling and Ensemble Machine Learning Models
Abstract
SME sustainability plays a crucial role in strengthening local and national economies; however, many enterprises face high risks due to limited access to capital, demand volatility, and disparities in operational capacity. This study develops an SME sustainability prediction model using composite index-based pseudo-labeling and ensemble machine learning. Pseudo-labels are constructed from six key indicators (revenue, profit, operating costs, average production volume, business scale, and number of employees) and categorized into three sustainability classes. The dataset consists of 400 SMEs operating in Palembang City, and Random Forest-based feature selection is used to identify the most relevant variables. The evaluated base models are Logistic Regression, Decision Tree, Random Forest, Support Vector Machine, and Gradient Boosting; the ensemble approaches are Bagging, Boosting, and Stacking. Logistic Regression achieves perfect accuracy (100%) on the test set, which suggests potential overfitting, whereas the Stacking ensemble provides more stable predictions with an F1-score of 0.918. Statistical validation using the Friedman and Wilcoxon tests confirms the superior performance of ensemble models compared to single learners. The contributions of this study are an objective pseudo-labeling method for unlabeled datasets and a robust predictive model that can support policymakers and SME stakeholders in identifying sustainability risks and implementing evidence-based interventions.
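The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the six indicators are scaled, weighted, or cut into classes, so the choices below (min-max scaling, equal weights, operating costs treated as "lower is better", tertile thresholds, and the class names `low`, `medium`, `high`) are assumptions made for illustration only, as are all field and function names and the toy records.

```python
from statistics import mean

# Indicator names follow the abstract; the sign of each one is an assumption.
# +1 means "higher is better", -1 means "lower is better".
INDICATORS = {
    "revenue": +1,
    "profit": +1,
    "operating_costs": -1,      # assumed direction, not stated in the abstract
    "production_volume": +1,
    "business_scale": +1,
    "employees": +1,
}


def min_max(values):
    """Scale a list of numbers to [0, 1]; constant columns map to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def composite_index(records):
    """Equal-weight average of the scaled indicators (assumed weighting)."""
    columns = []
    for name, direction in INDICATORS.items():
        scaled = min_max([r[name] for r in records])
        if direction < 0:
            scaled = [1.0 - s for s in scaled]  # invert cost-type indicators
        columns.append(scaled)
    return [mean(row) for row in zip(*columns)]


def pseudo_label(scores):
    """Split scores into three classes at the tertiles (assumed thresholds)."""
    ranked = sorted(scores)
    n = len(ranked)
    low_cut, high_cut = ranked[n // 3], ranked[(2 * n) // 3]
    labels = []
    for s in scores:
        if s < low_cut:
            labels.append("low")
        elif s < high_cut:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Hypothetical toy records, not data from the study.
records = [
    {
        "revenue": 10 * i,
        "profit": 2 * i,
        "operating_costs": 100 - 5 * i,
        "production_volume": 50 * i,
        "business_scale": i,
        "employees": 3 * i,
    }
    for i in range(1, 7)
]

scores = composite_index(records)
labels = pseudo_label(scores)
print(list(zip([round(s, 2) for s in scores], labels)))
```

The resulting labels would then serve as the training target for the base and ensemble classifiers. Because the labels are a deterministic function of the same indicators, a model that sees those indicators as features can reproduce the labeling rule almost exactly, which is one plausible reading of the perfect Logistic Regression accuracy reported above.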
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.