LightGBM-Driven Earthquake Magnitude Prediction: A Comparative
Machine Learning Framework Using Global Seismic Data
Nima Khodadadi1,∗
1Department of Civil and Architectural Engineering, University of Miami, Coral Gables, FL, USA
Email: nima.khodadadi@miami.edu
Abstract
Earthquakes are among the most destructive natural hazards, causing severe damage to entire
communities and loss of life. For decades, researchers have sought better forecasting tools for
seismic events, which strike unpredictably and cause massive economic losses. Classical
seismological methods and statistical and physical earthquake models do not effectively capture
the complex spatial and temporal characteristics of earthquake data. Machine learning (ML)
methods have attracted widespread interest for prediction because they learn patterns from large
datasets and can produce accurate results without explicit physical rules. This work examines
various ML models for predicting earthquake magnitude using an open-access global earthquake
dataset from 2023. The evaluation covers five predictive models: Light Gradient Boosting Machine
(LightGBM), Support Vector Regression (SVR), k-Nearest Neighbors (KNN), Ridge Regression,
and Extra Trees Regressor. The training process included stratified cross-validation and optimization
of hyperparameters for every model. Performance was assessed with a set of statistical
indicators: Mean Squared Error (MSE), Root Mean Squared Error (RMSE),
Mean Absolute Error (MAE), Mean Bias Error (MBE), Coefficient of Determination (R²),
Nash–Sutcliffe Efficiency (NSE), Willmott Index (WI), Pearson’s Correlation Coefficient (r), and
Relative Root Mean Squared Error (RRMSE). LightGBM outperformed all other evaluated models,
attaining the lowest MSE of 0.0474
and an R² score of 0.9241. LightGBM’s leaf-wise tree growth, scalability, and built-in
regularization enabled it to generalize well to unseen data samples without sacrificing computational speed.
These results support LightGBM as a powerful tool for recognizing subtle patterns in
high-dimensional seismic data, with potential use as a predictive modeling instrument in
earthquake-prone zones. ML-based forecasting systems have shown the potential to change earthquake prediction
practice, as these results indicate. Used together, LightGBM and other advanced ML
systems can enhance real-time early warning systems, leading to shorter emergency response times,
better planning decisions, and fewer human and economic losses from earthquakes. Combined with
open-access datasets, this approach supports seismic risk mitigation with broader transparency and
collaborative innovation through reproducible modeling strategies.
Keywords: Earthquake Prediction; Seismic Data Analysis; Light Gradient Boosting Machine;
Machine Learning Models
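
The nine performance indicators named in the abstract can be illustrated with a short sketch. The Python function below is an illustrative example under stated assumptions, not the study's implementation: the name regression_metrics is hypothetical, and the formulas follow common textbook definitions (MBE as predicted minus observed, R2 as 1 - SS_res/SS_tot, WI as Willmott's index of agreement, RRMSE as RMSE divided by the observed mean), which may differ from the definitions used in the paper. Under the definition assumed here, R2 and NSE are numerically identical.

```python
import math


def regression_metrics(observed, predicted):
    """Return common error and agreement metrics for paired values.

    Assumed conventions (not confirmed by the source):
    MBE = mean(predicted - observed), R2 = 1 - SS_res / SS_tot,
    RRMSE = RMSE / mean(observed), expressed as a fraction.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    pairs = list(zip(observed, predicted))
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    errors = [p - o for o, p in pairs]  # predicted minus observed

    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - o_mean) ** 2 for o in observed)     # spread of observations
    ss_pred = sum((p - p_mean) ** 2 for p in predicted)   # spread of predictions
    cov = sum((o - o_mean) * (p - p_mean) for o, p in pairs)
    # Willmott denominator: potential error relative to the observed mean
    wi_den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in pairs)

    # Constant observed or predicted values make some ratios undefined
    if ss_tot == 0 or ss_pred == 0 or o_mean == 0:
        raise ValueError("metrics undefined for constant series or zero observed mean")

    mse = ss_res / n
    rmse = math.sqrt(mse)
    nse = 1.0 - ss_res / ss_tot
    return {
        "MSE": mse,
        "RMSE": rmse,
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
        "R2": nse,   # same formula as NSE under the assumed definition
        "NSE": nse,
        "WI": 1.0 - ss_res / wi_den,
        "r": cov / math.sqrt(ss_tot * ss_pred),
        "RRMSE": rmse / o_mean,
    }


# Usage with made-up magnitudes (illustrative values, not data from the study)
scores = regression_metrics([4.0, 5.0, 6.0, 7.0], [4.5, 5.0, 5.5, 7.0])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Reporting error measures (MSE, RMSE, MAE, RRMSE) together with a signed bias (MBE) and agreement measures (R2, NSE, WI, r) helps separate how large the errors are from whether they are systematic and whether predictions track the observed variability.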