Volume 4, Issue 1, pp. 5-14, 2021 | Full Length Article
Vanita Jain 1*, Mahima Swami 2, Rishab Bansal 3
DOI: https://doi.org/10.54216/FPA.040101
Passwords are the first line of defense against malicious or unauthorized access to personal information. As more services move online, choosing strong passwords has become increasingly important. In this paper, the authors perform an exploratory data analysis of a database of 100 million email-password pairs. The analysis reports statistics on the most common passwords, the character sets used in passwords, the most common domains, average password length, password strength, the frequencies of letters, numbers, and symbols (special characters), the most common letter, number, and symbol, and the ratio of letters, numbers, and symbols in passwords. Together, these statistics highlight the general trends that users follow when creating passwords. Users can apply the results to make informed decisions when creating their own passwords, i.e., by avoiding the most common features and thereby creating more robust, less vulnerable passwords.
Data Analysis, Username-Password Dataset, Data Security
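
Several of the statistics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the `email:password` line format, the function name `password_stats`, and the sample records are assumptions made for illustration only, and password strength is left out because the abstract does not say how it is measured.

```python
from collections import Counter


def password_stats(records):
    """Summarize an iterable of 'email:password' strings (assumed format)."""
    domains = Counter()    # email domain -> number of records
    passwords = Counter()  # password -> number of records
    chars = Counter()      # individual character -> occurrences
    classes = Counter()    # 'letter' / 'number' / 'symbol' -> occurrences
    total_len = 0

    for rec in records:
        # Split on the first ':' only, so passwords may contain ':'.
        email, sep, pw = rec.partition(":")
        if not sep or "@" not in email or not pw:
            continue  # skip malformed lines
        domains[email.rsplit("@", 1)[1].lower()] += 1
        passwords[pw] += 1
        total_len += len(pw)
        for ch in pw:
            chars[ch] += 1
            if ch.isalpha():
                classes["letter"] += 1
            elif ch.isdigit():
                classes["number"] += 1
            else:
                classes["symbol"] += 1

    def top(pred):
        # Most frequent character satisfying pred, or None if there is none.
        for ch, count in chars.most_common():
            if pred(ch):
                return (ch, count)
        return None

    n = sum(passwords.values())
    total_chars = sum(classes.values())
    return {
        "count": n,
        "most_common_passwords": passwords.most_common(10),
        "most_common_domains": domains.most_common(10),
        "average_length": total_len / n if n else 0.0,
        "class_ratio": {
            k: (classes[k] / total_chars if total_chars else 0.0)
            for k in ("letter", "number", "symbol")
        },
        "most_common_letter": top(str.isalpha),
        "most_common_number": top(str.isdigit),
        "most_common_symbol": top(lambda c: not c.isalnum()),
    }


# Hypothetical sample records, not taken from the dataset in the paper.
sample = [
    "alice@example.com:123456",
    "bob@example.com:password1",
    "carol@mail.test:123456",
    "dave@mail.test:qwerty!",
    "malformed line",
]

stats = password_stats(sample)
for key, value in stats.items():
    print(f"{key}: {value}")
```

The sketch makes a single pass over the records and keeps only counters in memory, which is one plausible way to handle a file with on the order of 100 million lines; a real analysis would also need to decide how to treat encoding errors and duplicate records.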