Volume 16, Issue 1, pp. 223-232, 2024 | Full Length Article
Haitham S. Hasan 1 *
Doi: https://doi.org/10.54216/FPA.160115
The difficulty of automatically modifying and updating operations within Deep Learning (DL) frameworks can slow down Deep Neural Network (DNN) processing. This research presents a novel approach to software optimization that leverages dynamically collected profile data. A unique online auto-tuning system for DNNs was developed to enhance both the training and inference phases. Python Distributed Training of Neural Networks (PyDTNN), a lightweight toolkit designed for distributed DNN training and inference, is used to evaluate the VGG19 model on two distinct multi-core architectures. In testing, the auto-tuning system performs comparably to, if not better than, a static selection strategy. The performance of each PyDTNN variant that employs static selection remains consistently high throughout execution. Conversely, the auto-tuned version initially performs at a set level and progressively improves as more feasible choices become available. While both variants yield similar results in training, the auto-tuned selection strategy outperforms all other options in inference by autonomously determining the best strategy for each layer in VGG19. The new online implementation selection tool chooses the best-performing option from numerous alternatives while the program is running. Its key features include making decisions layer by layer and thoroughly examining 35 possibilities. We propose the system as a well-suited choice for monitoring sustainable environmental systems effectively, efficiently, and in a timely manner.
DNNs, auto-tuning, implementation selector, artificial intelligence, sustainable development
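The online selection idea summarized in the abstract (try the available implementations while the program runs, then keep the fastest one for each layer) can be illustrated with a minimal sketch. This is not the authors' code and does not use the PyDTNN API: the class name `OnlineSelector`, its `run` method, the `layer_id` key, and the injectable `timer` are all hypothetical names chosen for illustration, and the "try each candidate once" policy is an assumption.

```python
import time
from collections import defaultdict


class OnlineSelector:
    """Hypothetical per-layer selector (illustrative only, not PyDTNN).

    Each candidate is timed once per layer during normal execution;
    after that, the fastest candidate is reused for that layer.
    """

    def __init__(self, candidates, timer=time.perf_counter):
        # candidates: mapping from a name to a callable; all callables
        # are assumed to compute the same result for the same inputs.
        self.candidates = dict(candidates)
        self.timer = timer
        self.timings = defaultdict(dict)  # layer_id -> {name: seconds}
        self.best = {}                    # layer_id -> fastest name

    def run(self, layer_id, *args):
        # Exploitation: a winner is already known for this layer.
        if layer_id in self.best:
            return self.candidates[self.best[layer_id]](*args)

        # Exploration: time the next candidate not yet tried on this layer.
        seen = self.timings[layer_id]
        name = next(n for n in self.candidates if n not in seen)
        start = self.timer()
        result = self.candidates[name](*args)
        seen[name] = self.timer() - start

        # Once every candidate has been timed, fix the fastest one.
        if len(seen) == len(self.candidates):
            self.best[layer_id] = min(seen, key=seen.get)
        return result
```

Under these assumptions, a layer's forward pass would call `selector.run(layer_id, inputs)` instead of a fixed implementation. Early calls are spent measuring, so throughput starts at a baseline and improves once a winner is fixed, which matches the behavior the abstract describes for the auto-tuned version.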