<p>Interpretable Rainfall Forecasting Using SHAP-Enhanced Machine Learning: A Case Study on U.S. Urban Climate Data (2024&ndash;2025)</p>
<p>Khaled Sh. Gaber<sup>1,&lowast;</sup>, Mahmoud Elshabrawy Mohamed<sup>1,&lowast;</sup></p>
<p><sup>1</sup>Computer Science and Intelligent Systems Research Center, Blacksburg 24060, Virginia, USA</p>
<p>Emails: khsherif@jcsis.org; mshabrawy@jcsis.org</p>
<p>Abstract</p>
<p>Accurate rainfall prediction is fundamental to building climate resilience, supporting sustainable agriculture and water-distribution planning, and reducing disaster risk. Rainfall depends on many meteorological factors and exhibits nonlinear behavior and dependence across different timescales and spatial areas. These properties challenge conventional forecasting techniques, which often predict short-duration precipitation patterns with insufficient accuracy. Because of rising climate variability, we require data-driven predictive frameworks that combine strong performance with human-understandable outputs. In this paper, we develop a machine learning framework that predicts next-day rainfall from a refined dataset of detailed weather measurements spanning 20 U.S. metropolitan areas from 2024 to 2025. The dataset contains six atmospheric variables (temperature, humidity, wind speed, cloud cover, pressure, and precipitation) and a binary label indicating whether it rains the following day. Six machine learning models were trained and evaluated: Random Forest, Support Vector Machine with an RBF kernel (RBF SVM), K-Nearest Neighbors (KNN), Logistic Regression, Naive Bayes, and Linear SVM. The SHAP (SHapley Additive exPlanations) method was integrated to make the predictions more interpretable and trustworthy. SHAP values quantify and visualize the contribution of each input variable to individual predictions. SHAP analysis identified precipitation and humidity as the most influential features, which is consistent with meteorological theory and indicates that the model's decisions are physically plausible.</p>
<p>The Random Forest model achieved the highest performance of all models, with Precision, Recall, and F1-score all equal to 1.00 (100%). The RBF SVM and KNN models also performed strongly, with F1-scores of 0.97 and 0.94, respectively. Logistic Regression, Linear SVM, and Naive Bayes achieved satisfactory results, with F1-scores between 0.76 and 0.77. The SHAP-based diagnostics showed that Random Forest, in addition to its classification results, weighted features consistently across diverse locations. Integrating the Random Forest model with SHAP interpretation therefore yields an effective solution for rainfall forecasting that combines high predictive capability with interpretability. On the evaluated data the model pairs perfect classification scores with clear explanations, which supports trust in its use in real deployment scenarios. The results suggest that weather-sensitive sectors such as agriculture, urban planning, and disaster response can integrate such transparent machine learning systems into their decision-support pipelines. The approach described could serve as a template for future predictive analyses in meteorology and environmental science.</p>
<p>Keywords: Rainfall prediction; SHAP; Machine Learning; Random Forest</p>
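<p>To make the idea of Shapley additive explanations concrete, the following minimal sketch computes exact Shapley values by brute force for a single prediction. It is an illustration under stated assumptions, not the method used in this study: the scoring function <code>toy_rain_score</code>, its coefficients, the three-feature subset, and the instance and baseline values are all hypothetical, and a single baseline point stands in for the background dataset that the SHAP library would normally average over.</p>

```python
from itertools import combinations
from math import factorial


def toy_rain_score(x):
    """Hypothetical rain score; NOT the paper's Random Forest (assumption)."""
    return (0.5 * x["precipitation"]
            + 0.3 * x["humidity"]
            + 0.2 * x["humidity"] * x["cloud_cover"])


def coalition_value(model, x, baseline, coalition):
    """Evaluate the model with coalition features taken from the instance
    and all remaining features held at their baseline values."""
    mixed = {f: (x[f] if f in coalition else baseline[f]) for f in baseline}
    return model(mixed)


def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating every feature coalition.

    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    names = list(baseline)
    n = len(names)
    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                gain = (coalition_value(model, x, baseline, s | {f})
                        - coalition_value(model, x, baseline, s))
                total += weight * gain
        phi[f] = total
    return phi


# Hypothetical normalized feature values for one day and a reference day.
instance = {"humidity": 0.9, "precipitation": 0.6, "cloud_cover": 0.8}
baseline = {"humidity": 0.5, "precipitation": 0.1, "cloud_cover": 0.4}

phi = shapley_values(toy_rain_score, instance, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.4f}")
```

<p>The attributions are additive: they sum to the difference between the score for the instance and the score for the baseline, which is the property that lets per-feature contributions be read as a decomposition of a single prediction. For tree ensembles, the SHAP library provides much faster specialized algorithms than this enumeration.</p>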