Fusion: Practice and Applications
FPA
2692-4048
2770-0070
10.54216/FPA
https://www.americaspg.com/journals/show/789
2018
2018
Design of Effective Lossless Data Compression Technique for Multiple Genomic DNA Sequences
Software Engineering and IT Department, Ecole de technologie superieure, Montreal (Qc), Canada
..
Department of Computer Engineering, Halil University, Beyoglu, Istanbul, Turkey
Alireza
Souri
In recent years, massive amounts of genomic DNA sequence data have been generated, driving the development of new storage and archiving methods. Processing, storing, and transmitting this volume of DNA sequence data remains a major challenge. Data compression (DC) techniques reduce the number of bits needed to store and transmit data. DC has recently grown in popularity, and a large number of techniques have been proposed with applications in several domains. In this paper, a lossless compression technique, arithmetic coding, is employed to compress DNA sequences. To validate the performance of the proposed model, experiments were performed on an artificial genome dataset, and the results are examined in terms of several evaluation parameters. The compression performance of arithmetic coding is compared with that of Huffman coding, LZW coding, and LZMA. The simulation results indicate that arithmetic coding achieves significantly better compression than these techniques, with a compression ratio of 0.261 at a bit rate of 2.16 bpc.
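As an illustration of the two metrics quoted in the abstract, the following Python sketch computes a compression ratio and a bit rate for a synthetic DNA string. It is not the authors' implementation or dataset. It assumes that the compression ratio means compressed size divided by original size, that "bpc" means bits per character, and that the uncompressed input stores each base as one 8-bit character; the abstract states none of these explicitly. The function names `order0_entropy_bpc` and `compression_stats` are hypothetical. The order-0 entropy is included because it is the bit rate that an arithmetic coder driven by a static, memoryless symbol model can approach; LZMA from the Python standard library stands in as one of the baselines named in the abstract.

```python
import lzma
import math
import random
from collections import Counter


def order0_entropy_bpc(seq: str) -> float:
    """Empirical order-0 entropy of seq, in bits per character."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compression_stats(seq: str, compressed_size_bytes: int) -> tuple[float, float]:
    """Return (compression ratio, bits per character).

    Assumption: ratio = compressed bits / original bits, with the
    original stored as one 8-bit character per base.
    """
    original_bits = 8 * len(seq)
    compressed_bits = 8 * compressed_size_bytes
    return compressed_bits / original_bits, compressed_bits / len(seq)


# Synthetic, uniformly random sequence (an assumption, not the paper's data).
random.seed(0)
sequence = "".join(random.choice("ACGT") for _ in range(100_000))

entropy = order0_entropy_bpc(sequence)
lzma_size = len(lzma.compress(sequence.encode("ascii")))
ratio, bpc = compression_stats(sequence, lzma_size)

print(f"order-0 entropy: {entropy:.3f} bpc")
print(f"LZMA: ratio {ratio:.3f}, bit rate {bpc:.3f} bpc")
```

Under these assumptions the bit rate is simply eight times the compression ratio, and a four-symbol alphabet caps the order-0 entropy at 2 bits per character. Real genomic data is not uniformly random, so figures from this sketch are not comparable to the results reported in the paper.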
2021
2021
17
25
10.54216/FPA.060103
https://www.americaspg.com/articleinfo/3/show/789