COMPARATIVE ANALYSIS AND BAYESIAN HYPERPARAMETER OPTIMIZATION OF MACHINE LEARNING MODELS FOR MAIZE YIELD PREDICTION USING A LARGE-SCALE SYNTHETIC DATASET

Yazeed Al Moaiad

Abstract

Accurate crop yield prediction is critical for improving crop planning and ensuring food security. This work presents a comparative assessment of regression-based machine learning models for maize yield prediction, using 166,824 maize records drawn from a large-scale, structured synthetic agricultural dataset of one million observations. The dataset contains agronomic and environmental variables such as rainfall, temperature, soil type, irrigation use, fertilizer use, and days to harvest.


Four regression models were tested: Linear Regression, Random Forest, XGBoost, and Bayesian-optimized XGBoost. Performance was evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the coefficient of determination (R²). Linear Regression achieved the best predictive performance (RMSE = 0.4993; R² = 0.9143). Random Forest scored slightly lower (R² = 0.9067), and default XGBoost was competitive (R² = 0.9134). Bayesian optimization slightly improved the performance of XGBoost (RMSE = 0.4999; R² = 0.9141).
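The three reported metrics can be computed from any set of observed and predicted yields. The following is a minimal sketch in plain Python, not the authors' evaluation code; the function name `regression_metrics` and the toy yield values are illustrative assumptions that do not come from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Residual sum of squares and its mean (MSE).
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute error.
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the mean of the observed values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")

    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Toy values for illustration only (hypothetical yields, not study data).
    observed = [3.0, 5.0, 7.0, 9.0]
    predicted = [2.5, 5.5, 7.0, 9.0]
    print(regression_metrics(observed, predicted))
```

Because RMSE squares each error before averaging, it penalizes large misses more heavily than MAE, which is one reason the two are commonly reported together.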


A paired t-test showed statistical significance (p < 0.001), but effect size analysis (Cohen's d = -0.0226) indicated a trivial practical difference. Feature importance analysis indicated that fertilizer use, rainfall, and irrigation together accounted for most of the predictive power. The results suggest that, under structured dataset conditions, simpler linear models can be competitive with complex ensemble approaches.
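A very small p-value alongside a trivial effect size is expected when the sample is large, because the paired t statistic grows with the square root of the number of pairs. The sketch below illustrates this under several assumptions the abstract does not confirm: that the test compared per-sample errors of two models, that Cohen's d is the paired variant (mean difference divided by the standard deviation of the differences), and that the p-value can be approximated with a normal distribution for large samples. The hold-out size of 33,365 is hypothetical (20% of 166,824), and the function names are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_comparison(errors_a, errors_b):
    """Paired t statistic, approximate two-sided p-value, and paired Cohen's d.

    The p-value uses a normal approximation to the t distribution,
    which is only reasonable for large numbers of pairs.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")

    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")

    d_z = mean(diffs) / sd                  # paired-samples Cohen's d (assumed variant)
    t_stat = d_z * math.sqrt(len(diffs))    # t = mean / (sd / sqrt(n))
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_approx, d_z


def t_from_effect_size(d_z, n_pairs):
    """t statistic and approximate p-value implied by a paired d and sample size."""
    t_stat = d_z * math.sqrt(n_pairs)
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return t_stat, p_approx


if __name__ == "__main__":
    # d comes from the abstract; n_pairs is a hypothetical hold-out size.
    t_stat, p_approx = t_from_effect_size(-0.0226, 33_365)
    print(f"t = {t_stat:.3f}, approximate p = {p_approx:.2e}")
```

Under these assumptions, an effect of about two hundredths of a standard deviation still falls below the 0.001 threshold, which is consistent with the abstract's reading that the difference is statistically detectable but practically negligible.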

Article Details

How to Cite
Al Moaiad, Y., & Al Moaiad, Y. (2026). COMPARATIVE ANALYSIS AND BAYESIAN HYPERPARAMETER OPTIMIZATION OF MACHINE LEARNING MODELS FOR MAIZE YIELD PREDICTION USING A LARGE-SCALE SYNTHETIC DATASET. المجلة الدولية لأبحاث الحاسب المعاصرة, 1(1). https://doi.org/10.63226
Section
Artificial intelligence
