A Unified Linear Algebra–Centric Framework for Integrating Query Processing and GPU-Accelerated Machine Learning

Abdulnaser Rashid¹, Zahra I. Mahmoud², Mawahib Elamin³, Amel H. Abdalla³,
Adil O. Y. Mohamed¹,*

¹Department of Computer Science, College of Computer, Qassim University, Buraydah 51452, Saudi Arabia

²College of Public Health and Health Informatics, Hail University, Saudi Arabia

³Department of Mathematics, College of Science, Qassim University, Buraydah, 51452, Saudi Arabia

Emails: arshied@qu.edu.sa; Zahra.Mahmoud@uoh.edu.sa; ma.elhag@qu.edu.sa; Ahabdallh@qu.edu.sa; adi.mohamed@qu.edu.sa

Abstract

The increasing adoption of large-scale machine learning (ML) applications has exposed critical performance limitations in current data processing pipelines, particularly those caused by the separation between relational query execution and ML inference. This separation introduces redundant computation, excessive data materialization, and inefficient utilization of GPU matrix-processing resources [10]. In this paper, we present a unified execution framework that integrates relational query processing and ML prediction by representing both as linear algebra operations. Leveraging algebraic properties such as associativity and distributivity, we introduce an operator fusion strategy [8] that enables query operators and ML models to be executed jointly on GPU matrix-processing architectures [10]. This approach reduces intermediate data movement and enables end-to-end pipeline execution within a single linear algebra runtime. We analyze the computational complexity of the proposed fusion strategy and discuss its applicability to star-schema workloads commonly found in analytical systems. Experimental insights from prior studies indicate that linear algebra–based query execution combined with operator fusion [8] can yield substantial performance improvements over conventional GPU-accelerated pipelines [10], while maintaining scalability and portability. The proposed framework provides a viable foundation for future data-intensive systems that aim to unify analytics and machine learning on heterogeneous computing platforms [1–3,14–16]. This work unifies relational query processing and ML inference within a single algebraic runtime on GPUs, rather than coupling independent GPU-accelerated stages, thereby enabling cross-stage optimization and eliminating redundant materialization.
Unlike existing GPU-accelerated databases and tensor-based query processors, the proposed framework provides a system-level unification of relational analytics and machine learning inference, rather than treating them as isolated or sequential stages. The framework is backend-agnostic and applicable to modern tensor runtimes and heterogeneous accelerator platforms, making it suitable for next-generation data-intensive systems.

Keywords: Linear Algebra–based Query Processing (LAQ); Operator Fusion; GPU Matrix Processing Acceleration; Sparse Matrix Computation; SpMM; Machine Learning Inference; Physical Matrix Design
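As an informal illustration of the fusion idea summarized in the abstract, the following minimal sketch expresses a star-schema join and a selection as matrices and uses associativity to apply a linear model before the join result is materialized. It is not the framework's implementation: the data, the predicate, the linear model, and all variable names are assumptions chosen for illustration, and dense NumPy arrays on the CPU stand in for the sparse matrices and GPU SpMM kernels that a real system would use.

```python
import numpy as np

# Hypothetical star-schema data: 6 fact rows referencing 3 dimension rows.
fact_fk = np.array([0, 2, 1, 2, 0, 1])      # foreign key of each fact row
dim_features = np.array([[1.0, 2.0],
                         [0.5, 1.5],
                         [3.0, 0.0]])        # one feature row per dimension key
weights = np.array([2.0, -1.0])              # assumed linear model coefficients

# Join as a matrix: J[i, k] = 1 when fact row i references dimension row k.
J = np.zeros((fact_fk.size, dim_features.shape[0]))
J[np.arange(fact_fk.size), fact_fk] = 1.0

# Selection as a matrix: keep fact rows whose key is not 1 (arbitrary predicate).
keep = fact_fk != 1
S = np.eye(fact_fk.size)[keep]

# Unfused pipeline: materialize the joined feature table, then predict.
joined = S @ J @ dim_features                # intermediate of shape (4, 2)
pred_unfused = joined @ weights

# Fused pipeline: by associativity, (S J D) w == (S J) (D w), so the model is
# applied once per dimension row and the joined table is never built.
dim_scores = dim_features @ weights          # one score per dimension row
pred_fused = (S @ J) @ dim_scores

assert np.allclose(pred_unfused, pred_fused)
```

In this toy setting the fused form replaces a 4 × 2 intermediate with a length-3 score vector; the saving grows with the number of fact rows and feature columns, which is the effect the operator fusion strategy targets.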