Volume 7, Issue 2, PP: 71-80, 2022
Irina V. Pustokhina 1*, Denis A. Pustokhin 2
Doi: https://doi.org/10.54216/JISIoT.070207
The Internet of Things (IoT) has become a ubiquitous technology that enables the collection and analysis of large amounts of data. However, the limited resources of IoT devices make responsive decision-making difficult. Training a shared network requires many communication rounds, and each update can be very large when the model has many parameters, so the high latency of federated learning burdens both the participants and the wider IoT ecosystem. In this paper, we propose a Federated Knowledge Purification (FKP) approach that combines dynamic reciprocal knowledge purification with adaptive gradient compression, two strategies that enable low-latency communication without sacrificing effectiveness and thereby support responsive, resource-constrained IoT devices. FKP uses collaborative learning so that IoT devices can learn from each other's experience while keeping their data private. A smaller model is trained on the aggregated knowledge of a larger model maintained on a centralized server, and this smaller model can then be deployed on IoT devices to support responsive decision-making with limited computational resources. Experimental results demonstrate that the proposed approach improves the performance of IoT devices while preserving the privacy of their data, and that it outperforms existing federated learning methods in communication efficiency and convergence speed.
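The abstract names two building blocks, knowledge purification from a larger server model to a smaller device model and adaptive gradient compression, but gives no implementation detail. The sketch below is only an illustration of how these two steps are commonly realized; the function and parameter names (distillation_loss, compress_topk, temperature, alpha, compression_ratio) are assumptions made for this example and are not the authors' API.

```python
# Illustrative sketch only: not the paper's implementation.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term (knowledge transferred from the larger
    server-side model) with the usual cross-entropy on ground-truth labels."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd_term = F.kl_div(log_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term

def compress_topk(grad, compression_ratio=0.01):
    """Keep only the largest-magnitude entries of a gradient tensor before
    uploading it, one common way to realize gradient compression."""
    flat = grad.flatten()
    k = max(1, int(flat.numel() * compression_ratio))
    _, indices = torch.topk(flat.abs(), k)
    sparse = torch.zeros_like(flat)
    sparse[indices] = flat[indices]
    return sparse.view_as(grad)
```

In a round of such a scheme, a device would train its local (student) model with distillation_loss against soft predictions from the server model, then upload the output of compress_topk instead of the dense gradient, which is what keeps the per-round communication volume, and hence the latency, low.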
Internet of Things (IoT), Federated Learning, Knowledge Purification, Latency, Communication Overhead