Securing DNS over HTTPS: A Machine Learning Study on Traffic
Classification Using DoHBrw-2020
Al-Seyday.T. Qenawy1,∗, Hussein Alkattan2, Amany Khaled3
1Intelligent Systems and Machine Learning Lab, Shenzhen 518000, China
2Department of System Programming, South Ural State University, 454080 Chelyabinsk, Russia
3Department of Clinical Pharmacy and Pharmacy Practice, Faculty of Pharmacy, Mansoura University,
Mansoura, Egypt
Emails: S.Qenawy@asia.com, alkattan.hussein92@gmail.com, amany24khaled@gmail.com
Abstract
This paper reviews related work on classifying secure DNS traffic, with emphasis on identifying threats
related to DNS over HTTPS (DoH) using machine learning algorithms. Using the DoHBrw-2020 dataset, which
consists of network traffic data of the DoH protocol during its testing phase, we compare the performance
of several machine learning algorithms: Decision Tree, SVM, KNN, Naïve Bayes, Neural Network (MLP),
Gradient Boosting, and SVM with RBF kernel. For each model, we report Accuracy, Sensitivity, Specificity,
Positive Predictive Value, Negative Predictive Value, and F-Score. The results show that the Decision Tree
model achieves the highest accuracy, 99.65%, and performs well on the other evaluation criteria. The study
finds that machine learning methods have high potential for improving DNS traffic security, and it offers
insight into the best models to use for real-time detection of DoH threats. These outcomes provide
perspectives for the further development and deployment of safer DNS solutions within contemporary
information security paradigms.
Keywords: DNS over HTTPS, Machine Learning, Traffic Classification, DoHBrw-2020, Cybersecurity
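
The six evaluation measures named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions; it is not taken from the authors' code. The function name `binary_metrics`, the choice of treating one class as "positive", and the example counts are assumptions made purely for illustration and are not results from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute six evaluation metrics from binary confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives,
    false negatives. Which class counts as "positive" is an assumption
    left to the caller.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    ppv = ratio(tp, tp + fp)           # positive predictive value (precision)
    npv = ratio(tn, tn + fn)           # negative predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    # F-score here is the F1 variant: harmonic mean of PPV and sensitivity.
    f_score = ratio(2 * ppv * sensitivity, ppv + sensitivity)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_score": f_score,
    }


# Hypothetical counts for illustration only; not results from the paper.
example = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Reporting all six measures together, rather than accuracy alone, helps when the classes are imbalanced, since a model can reach high accuracy while still missing most samples of the rarer class.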