Volume 3, Issue 1, PP: 05-13, 2020 | Full Length Article
Rabeb Touati 1*, Dr. Imen Ferchichi 2, Dr. Imen Messaoudi 3, Dr. Afef Elloumi Oueslati 4, Dr. Zied Lachiri 5
The first microRNA was discovered 27 years ago in the genome of the nematode C. elegans. MicroRNAs (miRNAs) are short RNA sequences, expressed in a wide range of genomes, that regulate genes post-transcriptionally by affecting the translation or stability of their target mRNAs. Mature miRNAs are derived from longer precursor sequences (pre-miRNAs). Previous work has shown that pre-miRNAs can be classified by their species of origin using bioinformatics techniques combined with machine learning tools. In this study, we focus on classifying precursor microRNA sequences from 16 species, spanning animals, plants, and viruses, by combining machine learning algorithms with features extracted from images that represent the DNA sequences. Our results show that a system based on features of the energy images of pre-miRNA signals, obtained by encoding each DNA sequence with the PNUC coding technique, is very efficient for inter-genomic miRNA recognition.
microRNA, precursor microRNA, features, scalogram, wavelet-energy, classification