Volume 5, Issue 1, pp. 43-61, 2021 | Full Length Article
Ahmed A. Elngar 1*, Mohamed Gamal 2, Amar Fathy 3, Basma Moustafa 4, Omar Mahmoud 5, Mohamed Shaban 6
Doi: https://doi.org/10.54216/JCIM.050104
Any of us may have plagiarized a text without realizing that it was plagiarism. Plagiarism can occur in articles, papers, research, literature, music, software, scientific work, newspapers, websites, Master's and PhD theses, and many other fields, and it has become a serious problem for teachers, researchers, and publishers. Opinions diverge on how to define plagiarism and on what makes it serious. Detecting plagiarism is therefore very important. In this survey we explain the concept of "plagiarism", provide an overview of different plagiarism detection software and tools, and discuss the plagiarism process, its types, and detection methodologies. Plagiarism can be summarized in one sentence: "someone used someone else's mental product (such as its texts, ideas, or privacy)". We suggest that what makes plagiarism so reprehensible is that it distorts scientific credit. In addition, intentional plagiarism indicates dishonesty, and plagiarism has a number of possible negative consequences. We therefore create a framework for external plagiarism detection in which several NLP processes are applied to a set of suspicious and original documents. We classify the different plagiarism detection techniques into those based on lexical, semantic, syntactic, and grammar analysis algorithms, all of which are preceded by NLP processing.
Text, plagiarism, NLP, detection methodologies, Lexical Analysis, Semantic Analysis, NLTK, LSA, PLSA, LDA