Abstract
Compression refers to reducing the quantity of data, that is the number of bits used to represent, store and/or transmit file content, without excessively
reducing the quality of the original data. Data compression techniques can be lossless, which enables exact reproduction of the
original data on decompression, or lossy, which does not. Text data compression techniques are normally lossless. There are many
text data compression algorithms, and it is essential that the best ones be identified and evaluated towards optimizing compression
purposes. This research focused on proffering key compression algorithms and evaluating them towards ensuring the selection of
better algorithms over poorer ones. Dominant text data compression algorithms employing statistical and dictionary-based
compression techniques were determined through qualitative analysis of extant literature, examined for convergence and exposition using
an inductive approach. The proffered algorithms were implemented in Java and evaluated on compression ratio,
compression time and decompression time using text files obtained from the Canterbury corpus. Huffman and Arithmetic coding
were proffered for the statistical technique, and Lempel-Ziv-Welch (LZW) for the dictionary-based technique. LZW was indicated as the
best, with the highest compression ratio of 2.36314, followed by Arithmetic coding with a compression ratio of 1.70534, and Huffman coding with a
compression ratio of 1.62877. It was noted that the performance of the data compression algorithms on compression time and
decompression time depends on the characteristics of the files, that is, the distinct symbols and their frequencies. Usage of LZW in
data storage and transmission would go a long way towards optimizing compression.
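For illustration, the listing below is a minimal sketch of an LZW encoder in Java, the language used for the implementations reported here, together with a simplified compression-ratio estimate computed as input symbols divided by emitted codes. The class name, the sample string and this simplified ratio measure are assumptions made for exposition and are not taken from the paper's implementation, which measures ratios on Canterbury corpus files.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal LZW encoder sketch (illustrative only; not the paper's exact implementation).
public class LzwSketch {

    // Encodes the input text into a list of dictionary codes.
    static List<Integer> encode(String text) {
        // Initialise the dictionary with all single-character strings (codes 0-255).
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) {
            dict.put(String.valueOf((char) i), i);
        }
        int nextCode = 256;

        List<Integer> output = new ArrayList<>();
        String current = "";
        for (char c : text.toCharArray()) {
            String candidate = current + c;
            if (dict.containsKey(candidate)) {
                // Longest match so far; keep extending it.
                current = candidate;
            } else {
                // Emit the code for the longest match and register the new phrase.
                output.add(dict.get(current));
                dict.put(candidate, nextCode++);
                current = String.valueOf(c);
            }
        }
        if (!current.isEmpty()) {
            output.add(dict.get(current));
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, chosen only to show repeated phrases being reused.
        String sample = "TOBEORNOTTOBEORTOBEORNOT";
        List<Integer> codes = encode(sample);
        // Simplified ratio: input characters per emitted code; higher means better compression.
        double ratio = (double) sample.length() / codes.size();
        System.out.println("codes: " + codes.size() + ", ratio ~ " + ratio);
    }
}

The dictionary-based behaviour visible in the sketch, where repeated phrases are replaced by single codes, is what gives LZW its advantage on text with recurring symbol sequences, consistent with its leading compression ratio reported above.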