Volume 18, Issue 1, pp. 304-320, 2025 | Full Length Article
Diyar Mohammed 1 , Esraa Hadi Alwan 2 , Ahmed Fanfakh 3
Doi: https://doi.org/10.54216/FPA.180121
Compiler optimization is crucial to program performance: it enhances execution speed, reduces memory usage, and minimizes energy consumption. Nevertheless, modern compilers such as LLVM, with their numerous optimization passes, present a significant challenge in identifying the most effective sequence for optimizing a program. This study addresses the complex problem of determining optimal compiler optimization sequences within the LLVM framework, which encompasses 64 optimization passes, resulting in an immense search space of 2^64. Identifying the ideal sequence for even simple code is an arduous task, as the interactions between passes are intricate and unpredictable. The primary objective of this research is to apply machine-learning techniques to predict effective optimization sequences that outperform the default -O2 and -O3 optimization flags. The methodology involves generating 2,000 sequences per program and selecting the one that achieves the shortest execution time. Three machine-learning models—K-Nearest Neighbor (KNN), Decision Tree (DT), and Feedforward Neural Network (FFNN)—were employed to predict optimization sequences based on features extracted from programs during execution. The study used benchmarks from the Polybench, Shootout, and Stanford suites, each with varying problem sizes, to validate the proposed technique. The results demonstrate that the KNN model produced optimization sequences with superior performance compared to DT and FFNN. On average, KNN achieved execution times 2.5 times faster than those obtained with the -O3 optimization flag. This research contributes to the field by automating the selection of optimal compiler sequences, which significantly reduces execution time and eliminates the need for manual tuning.
It highlights the potential of machine learning in compiler optimization, offering a robust and scalable approach to improving program performance and setting the foundation for future advancements in the domain.
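The approach described above can be sketched in two parts: sampling candidate pass sequences from the search space, and using a nearest-neighbor lookup to predict a good sequence for a new program from the features of similar, already-tuned programs. The pass names, feature vectors, and training tuples below are illustrative assumptions, not the paper's actual data; the paper evaluates 2,000 sampled sequences per program with real LLVM compilations.

```python
import random

# A hypothetical subset of LLVM optimization pass names (illustrative only).
PASSES = ["mem2reg", "inline", "loop-unroll", "gvn", "licm", "sccp", "dce", "instcombine"]

def random_sequence(length, rng):
    """Draw one candidate pass ordering from the huge search space."""
    return [rng.choice(PASSES) for _ in range(length)]

def euclidean(a, b):
    """Distance between two program feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_best_sequence(train, query_features, k=3):
    """train: list of (feature_vector, best_sequence, runtime) tuples.
    Among the k training programs whose features are closest to the
    query program, return the sequence with the shortest recorded runtime."""
    nearest = sorted(train, key=lambda t: euclidean(t[0], query_features))[:k]
    return min(nearest, key=lambda t: t[2])[1]

# Toy training set: dynamic features -> best-of-2000 sequence and its runtime.
rng = random.Random(0)
train = [
    ([1.0, 0.2], ["mem2reg", "gvn"], 0.8),
    ([0.9, 0.3], ["licm", "loop-unroll"], 0.5),
    ([0.1, 2.0], ["inline", "dce"], 1.1),
]
print(knn_best_sequence(train, [0.95, 0.25], k=2))  # prints ['licm', 'loop-unroll']
```

In practice, each candidate sequence would be applied with LLVM's `opt` tool and timed, and the per-program winner stored as the training label; the k-NN step then amortizes that expensive search across unseen programs.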
Compiler optimization, Machine learning, Optimization sequence, Program performance, LLVM framework