Research Article
Analysis of DICOM Image Compression Alternative Using Huffman Coding

Romi Fadillah Rahmat,1 T. S. M. Andreas,1 Fahmi Fahmi,2 Muhammad Fermi Pasha,3 Mohammed Yahya Alzahrani,4 and Rahmat Budiarto4

1Department of Information Technology, Faculty of Computer Science and Information Technology, Universitas Sumatera Utara, Medan 20155, Indonesia
2Department of Electrical Engineering, Faculty of Engineering, Universitas Sumatera Utara, Medan 20155, Indonesia
3Malaysia School of Information Technology, Monash University, Bandar Sunway 47500, Malaysia
4College of Computer Science and Information Technology, Albaha University, Al Bahah, Saudi Arabia

Correspondence should be addressed to Muhammad Fermi Pasha; muhammadfermipasha@monash.edu

Received 24 September 2018; Revised 14 March 2019; Accepted 9 April 2019; Published 17 June 2019

Academic Editor: Angkoon Phinyomark

Copyright © 2019 Romi Fadillah Rahmat et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Compression, in general, aims to reduce file size, with or without decreasing the data quality of the original file. Digital Imaging and Communications in Medicine (DICOM) is a medical imaging file standard used to store information such as patient data, imaging procedures, and the image itself. With the rising usage of medical imaging in clinical diagnosis, there is a need for a fast and secure method to share large numbers of medical images between healthcare practitioners, and compression has always been an option. This work analyses the Huffman coding compression method, one of the lossless compression techniques, as an alternative method to compress a DICOM file in open PACS settings. The idea of the Huffman coding compression method is to assign codewords with fewer bits to symbols with a higher byte frequency distribution. Experiments using different types of DICOM images are conducted, and an analysis of the performance in terms of compression ratio, compression/decompression time, and security is provided. The experimental results show that the Huffman coding technique can compress a DICOM file at up to a 1 : 3.7010 ratio, with up to 72.98% space savings.

1. Introduction

DICOM (Digital Imaging and Communications in Medicine) is a file standard used to handle, store, print, and send information in medical imaging. All modern medical imaging devices (imaging modalities), such as X-ray, CT (computed tomography), and MRI (magnetic resonance imaging) scanners, use DICOM as their standardized file output. A DICOM file consists of data elements or attributes that store information such as patient data (name, sex, etc.), imaging procedure (calibration, radiation dose, contrast media, etc.), and information about the image itself (resolution, pixel data, bit allocation, etc.) [1]. Because a DICOM file is larger than other standard image files, its storage and transmission become a problem in an integrated hospital information system (HIS) with a picture archiving and communication system (PACS) implementation. The larger the data, the more storage media and transmission bandwidth are required, which raises procurement costs for storage and bandwidth [2–4].

Data compression is one solution to this problem. Data compression converts an input data source into output data of a smaller size [5]. The main purposes of compression techniques are memory efficiency, fast compression, and generation of the best output. Compression can be divided into two types, namely, lossless compression and lossy compression. Lossless compression does not remove any information from the initial data, while lossy compression removes some of it [6].

Hindawi Journal of Healthcare Engineering, Volume 2019, Article ID 5810540, 11 pages, https://doi.org/10.1155/2019/5810540

Lossy data compression is usually used to obtain a higher compression ratio without considering the loss of information in the image [7]. The latest research on lossy data compression was conducted by Kumar et al. [8], who used a logarithm method called LDCL (lossy data compression logarithm). Their experimental results showed that the method could generate a compression ratio of up to 1 : 60 in many cases.

Lossless JPEG2000 is a popular data compression method used in various PACS and is considered the standard for DICOM compression [9], despite not being backward compatible [10]. Nevertheless, ongoing research continues to analyze the performance of the JPEG2000 compression method and to propose alternative compression methods in PACS, with the aim of balancing image quality and transfer duration [9, 11–15]. Thus, this work implements and provides a performance analysis of Huffman coding, identified as one of the lossless standard data compression methods by the US Food and Drug Administration (FDA) [16].

Existing work on adopting Huffman coding to compress DICOM images by Kavinder [17] did not address performance, the security aspect, complexity, or compression time when compressing a DICOM image file while considering the information stored in the file.

2. Related Work

Huffman coding has been used in many data compression cases. In 2015, Ezhilarasu et al. [18] reviewed Huffman coding and concluded that Huffman codes can provide better compression ratio, space savings, and average bits than uncompressed data. A comparative study was performed by Maan [19] in 2013, who analyzed and compared three lossless data compression codings, namely, Huffman, arithmetic, and run length. The experimental results showed that arithmetic coding generates the highest compression ratio among lossless data compression techniques, but its compression speed is slower than that of Huffman coding.

Another related research work was done by Medeiros et al. in 2014 [20]. They compressed lightweight data for wireless sensor networks (WSNs) monitoring environmental parameters using low-resolution sensors. The compression ratio percentage obtained in their experiment varied from 46% to 82%. The researchers stated that Huffman coding is extremely simple and outperforms lossless entropy compression (LEC) and adaptive linear filtering compression (ALFC) in most cases.

Research has been conducted on DICOM image file compression using various techniques; in fact, several studies combined lossless and lossy data compression techniques. In 2013, Kavinder [17] combined Huffman coding (lossless) and the discrete cosine transform (lossy) and improved the technique by using vector quantization to increase the compression ratio. In 2015, Kumar and Kumar [21] used hybrid techniques of discrete wavelet transform-discrete cosine transform (DWT-DCT) and Huffman coding, while Fahmi et al. introduced sequential storage of differences for image compression in a medical image cloud application [22, 23]. Other works on lossless and lossy data compression techniques are found in [24–28].

3. Materials and Methods

In previous studies, lossy data compression techniques generate a high compression ratio but decrease the peak signal-to-noise ratio (PSNR), a quality metric generally used to analyse the quality of an image: the higher the PSNR, the better the quality of the compressed or reconstructed image. Thus, a lossless technique should be applied to enhance compression at the same PSNR. An image can be compressed without the loss of significant details through Huffman coding. From the perspective of a DICOM file, we expect the DICOM image to have intact quality and metadata after compression and decompression. The standard DICOM compression method is JPEG2000, and thus we compare the performance of JPEG2000 and Huffman coding as an alternative DICOM compression method. The lossless criteria of Huffman coding are the foundation of this work: image quality after decompression is a vital point here and is the reason for selecting Huffman coding as the methodology.

Figure 1 shows the three parts of the methodology. The first part is Huffman encoding for DICOM image file compression. The DICOM image file is collected first and used as a source. This part encodes (compresses) the file by calculating the byte frequency distribution (BFD), creating a prefix tree to get codewords, changing the byte distribution into codewords, and then performing bit padding if necessary. The second part is the View, which displays the DICOM image and includes the steps for determining the image frame, setting the window level and width, and resizing the image. The third part is Huffman decoding for decompression. In the decoding process, a prefix tree is read from the compressed data flow file, and codeword sequences are extracted from the data flow and changed back into the original byte distribution. The more detailed steps of the methodology are described below.

3.1. Huffman Encode. The input DICOM file is compressed in the first step. The byte distribution of the file is read and its BFD is calculated. Next, a prefix tree is created to obtain the codewords that will substitute the byte distribution in the input file, generating a new, smaller file. The data used for encoding are the byte distributions compiled in the DICOM file, as shown in the following (in hexadecimal):

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
44 49 43 4d 02 00 00 00 55 4c 04 00 be 00 00 00
82 04 82 04 82 04 82 04 82 04 82 04 82 04 82 04
7d 04 78 04 78 04 78 04 78 04 78 04 69 04 5a 04
5a 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The BFD, a table storing the frequency of occurrence of every byte value that makes up the file, is calculated for each DICOM file. For example, when the FF byte (255) occurs 887 times in a file, the value of the FF byte (255) in the BFD table is 887. Table 1 shows the BFD from one of the DICOM files used in the experiment. If the FF byte (255) occurs 200 times in a file, then the value of the FF byte (255) in the BFD table is 200.
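As an illustrative sketch (in Python rather than the authors' VB.NET), the BFD step amounts to counting byte occurrences; the sample bytes below are hypothetical, not taken from a real DICOM file.

```python
from collections import Counter

def byte_frequency_distribution(data: bytes) -> Counter:
    """Count how often each byte value (0-255) occurs in the data."""
    return Counter(data)

# Hypothetical byte stream standing in for a DICOM file's contents.
sample = bytes([0x00] * 5 + [0xFF] * 3 + [0x04] * 2)
bfd = byte_frequency_distribution(sample)
assert bfd[0x00] == 5 and bfd[0xFF] == 3 and bfd[0x04] == 2
```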

Once the calculation of the BFD is completed, a prefix tree is created to obtain appropriate codewords as substitutes for the byte distribution [6]. Table 2 shows the codewords generated by traversing the prefix tree, which was created on the basis of the BFD in Table 1.
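The paper does not list its tree-building code; the sketch below is the textbook greedy construction (repeatedly merging the two lowest-frequency nodes) in Python, using a truncated, hypothetical frequency table loosely based on Table 1.

```python
import heapq

def huffman_codewords(freq: dict) -> dict:
    """Build a Huffman prefix tree from a byte-frequency table and
    return the codeword (bit string) for each symbol."""
    # Heap entries: (frequency, unique tie-breaker, tree); a tree is
    # either a bare symbol or a (left, right) pair of subtrees.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: branch 0/1
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                             # leaf: record the codeword
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# Truncated frequencies inspired by Table 1 (illustration only).
freq = {0x00: 2661671, 0x04: 702819, 0x03: 49653, 0xFE: 49043, 0xFF: 887}
codes = huffman_codewords(freq)
# More frequent symbols get shorter (or equal-length) codewords.
assert len(codes[0x00]) <= len(codes[0x03]) <= len(codes[0xFF])
```

More frequent bytes end up closer to the root, which is exactly why byte 00 in Table 2 receives the shortest (1-bit) codeword.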

Bit padding is the process of adding one or more bits to the data flow to fit the minimum 8-bit computer architecture. For example, when the generated data size is 7 bits, 1 bit of padding is required to fill 8 bits (1 byte); if the generated data size is 28 bits, then 4 bits of padding are required to fill 32 bits (4 bytes), and so on.
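A minimal sketch of that padding rule in Python (the helper name is ours, not the paper's):

```python
def pad_bits(bitstring: str) -> tuple[str, int]:
    """Pad a codeword stream with zeros up to the next multiple of 8 bits."""
    pad = (-len(bitstring)) % 8
    return bitstring + "0" * pad, pad

padded, n = pad_bits("1011011")   # 7 bits -> 1 padding bit fills 1 byte
assert len(padded) == 8 and n == 1
padded, n = pad_bits("1" * 28)    # 28 bits -> 4 padding bits fill 4 bytes
assert len(padded) == 32 and n == 4
```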

3.2. View. The View part comprises the features built as the user interface. This part shows the user the compressed DICOM image, displayed according to the original size or the size of the provided canvas; it is called the DICOM Viewer in other systems. The View loads the image from the DICOM file and determines the DICOM image frame to be displayed.

Table 1: BFD for the CT0011.dcm file.

Byte (hexadecimal)   Byte (decimal)   Frequency
00                   0                2661671
01                   1                613
02                   2                724
03                   3                49653
04                   4                702819
...
FD                   253              2
FE                   254              49043
FF                   255              887

Table 2: Codewords for the byte distribution of the CT0011.dcm file.

Byte (hexadecimal)   Byte (decimal)   Codeword
00                   0                1
01                   1                010000011110
02                   2                010100111001
03                   3                010111
04                   4                00
...
FD                   253              011111011100100011010
FE                   254              010101
FF                   255              011011001001

Figure 1: General architecture. Huffman encode: calculate the byte frequency distribution of the DICOM image, create a prefix tree based on the frequency calculation, traverse the prefix tree to get codewords in accordance with the byte distribution, transform the byte distribution into the appropriate codewords, and do bit padding if the last codeword distribution is not enough to create a byte, producing the compressed file. View: load the image from the DICOM file, determine the DICOM image frame to be displayed, resize the image if larger than 512 x 512, set the window level and window width, and read and display the pixel distribution in the provided picture box. Huffman decode: generate the prefix tree from the data flow, construct a lookup table for symbols based on the generated prefix tree, extract the codeword sequence from the data flow, and change the generated codewords into the byte distribution in accordance with the created lookup table.


The View also sets the window level and width to be configured, resizes the image, and reads and displays the pixel distribution of the image. The View describes the metadata of the DICOM file.

3.3. Huffman Decode. The Huffman decode part decompresses the file to restore the original file. It starts by reading the data flow of the compressed file, creating a prefix tree from the data previously stored in the data flow during the encoding process, and constructing a lookup table that contains codewords and symbols to be changed back into the original byte structure before compression [29]. Using a bottom-up Huffman tree with probabilities [30] is good in terms of run time; however, for the case of DICOM compression, we found that using a lookup table provides a balance between faster run time and memory usage.

The lookup table contains codewords and the symbols they represent, generated by traversing the previously created prefix tree. The table is used to change the codeword sequence from the data flow back into the original byte distribution. Table 3 shows the lookup table for the compressed DICOM file CT0011.dcm (CT0011.HUF).

After reading the codeword sequence of the compressed file's data flow, each generated codeword is changed into the original byte distribution or symbol according to the created lookup table. This step is conducted by reading every bit of the data flow and determining whether the bit sequence is in the lookup table. If not, the next bit is read and appended to the previous bits, and the search in the lookup table is repeated. If the bit sequence is found as one of the codewords that represents a symbol, the sequence is changed into that symbol. The process continues until the number of returned symbols reaches the original data size.
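That bit-by-bit lookup can be sketched as follows (Python rather than the authors' VB.NET; the three-entry lookup table is an excerpt of Table 3, keyed by codeword):

```python
def huffman_decode(bitstream: str, lookup: dict, n_symbols: int) -> list:
    """Decode a bit string by growing a bit buffer until it matches a
    codeword in the lookup table, then emitting that symbol."""
    out, buf = [], ""
    for bit in bitstream:
        buf += bit
        if buf in lookup:            # codeword found -> emit its symbol
            out.append(lookup[buf])
            buf = ""
            if len(out) == n_symbols:
                break                # original data size reached
    return out

# Excerpt of Table 3, inverted to map codeword -> symbol.
lookup = {"1": 0, "00": 4, "010111": 3}
bits = "1" + "00" + "010111" + "1"
assert huffman_decode(bits, lookup, 4) == [0, 4, 3, 0]
```

Because Huffman codewords are prefix-free, the first match found while growing the buffer is always the correct one.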

The programming language used in this research is VB.NET, and the operating system used is Windows 7 Ultimate 64-bit SP1. The computer specification is an Intel i5-450M processor, 4 GB RAM, a 500 GB HDD, and an ATI Radeon HD 5470 graphics card.

The data used in this research are several sets of DICOM image files available at http://www.mmnt.net, plus a set of anonymized DICOM images from computed tomography collected randomly from the University Hospital RS USU Medan. The files used were 20 DICOM image files with the extension *.dcm. The specifications of all DICOM files are described in Table 4.

4. Results and Discussion

4.1. Huffman Coding Compression Performance. Tables 5 and 6 present the results of DICOM file compression using the Huffman coding technique; the file specifications are provided in Table 4. From the obtained results, the best space savings is 72.98% at a 1 : 3.7010 compression ratio, while the lowest space savings is −0.08%, with a worst compression ratio of 1 : 0.9992.
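The two reported metrics follow the usual definitions, compression ratio = original size / compressed size and space savings = 1 − compressed/original; a quick Python check against the CT0011.dcm row of Tables 5 and 6:

```python
def compression_stats(original: int, compressed: int) -> tuple[float, float]:
    """Compression ratio (original/compressed) and space savings in %."""
    ratio = original / compressed
    savings = (1 - compressed / original) * 100
    return ratio, savings

# CT0011.dcm sizes from Table 5; results match Table 6.
ratio, savings = compression_stats(4202378, 1371932)
assert round(ratio, 4) == 3.0631 and round(savings, 2) == 67.35
```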

One factor that affects the compression ratio is the number of nodes or symbols that make up the prefix tree of the image file. This is shown by the CT-MONO2-16-ankle.dcm and CT-MONO2-16-brain.dcm files, which have nearly the same original size (±525 kB) but different compression ratios: the CT-MONO2-16-ankle.dcm file achieved roughly twice the compression ratio of the CT-MONO2-16-brain.dcm file.

Compared to the other image data sets in Table 6, the CT0013.dcm file has a smaller compression ratio than the CT0014.dcm file, although the former has fewer symbols. Hence, another factor affects the compression ratio apart from the number of symbols: the BFD of the file. The BFD values of the CT0013.dcm and CT0014.dcm files are shown in Table 7.

The BFD of the CT0013.dcm file spreads more evenly than that of the CT0014.dcm file. This causes the codewords of the former to be longer than those of the latter. For instance, if the assumed codeword lengths are 2 bits for byte 00, 5 bits for byte 03, and 6 bits for byte FE, then the obtained size of the CT0013.dcm file, when other bytes are disregarded, is as follows:

(663477 × 2 + 24806 × 5 + 24489 × 6) = 1597918 bit ≈ 199740 byte, (1)

while the obtained size of the CT0014.dcm file is as follows:

(698830 × 2 + 14 × 5 + 91 × 6) = 1398276 bit ≈ 174785 byte. (2)

Table 3: Lookup table for the CT0011.HUF file.

Symbol   Codeword
0        1
1        010000011110
2        010100111001
3        010111
4        00
...
253      011111011100100011010
254      010101
255      011011001001

Table 4: DICOM file specifications used in the experiments.

File name               Number of frames   Size (byte)
CT0011.dcm              8                  4202378
CT0012.dcm              2                  1052902
CT0013.dcm              2                  1052750
CT0014.dcm              2                  1053770
CT0031.dcm              15                 7871216
CT0032.dcm              1                  527992
CT0033.dcm              7                  3675976
CT0034.dcm              1                  528012
CT0035.dcm              6                  3149976
CT0051.dcm              1                  208402
CT0052.dcm              1                  259602
CT0055.dcm              1                  208416
CT0056.dcm              1                  259616
CT0059.dcm              1                  362378
CT0081.dcm              2                  2607730
CT0110.dcm              9                  4725954
CT-MONO2-8-abdo.dcm     1                  262940
CT-MONO2-16-ankle.dcm   1                  525436
CT-MONO2-16-brain.dcm   1                  525968
CT-MONO2-16-chest.dcm   1                  145136

In the experimental results, we also observe a limitation of Huffman coding: one case generates a compression ratio of less than 1 (ratio < 1), causing the generated compressed file to be larger than the original file.

Table 8 shows the BFD pattern and the codewords from the CT-MONO2-16-chest.dcm file, which has a 1 : 0.9992 compression ratio. The BFD values are nearly evenly distributed, which causes the created prefix tree to generate codewords that are approximately equal to or longer than the bit length required to create 1 byte (8 bits).

Table 5: Compression results.

File name               Original size (byte)   No. of symbols/nodes   Compression time (s)   Compressed size (byte)
CT0011.dcm              4202378                254                    2.75                   1371932
CT0012.dcm              1052902                223                    0.67                   346880
CT0013.dcm              1052750                209                    0.60                   328241
CT0014.dcm              1053770                254                    0.65                   284727
CT0031.dcm              7871216                256                    17.25                  5057232
CT0032.dcm              527992                 256                    0.65                   322986
CT0033.dcm              3675976                256                    6.50                   2592421
CT0034.dcm              528012                 256                    0.78                   380848
CT0035.dcm              3149976                256                    3.02                   1395825
CT0051.dcm              208402                 256                    0.24                   118713
CT0052.dcm              259602                 256                    0.29                   145620
CT0055.dcm              208416                 256                    0.26                   127395
CT0056.dcm              259616                 256                    0.31                   164952
CT0059.dcm              362378                 256                    0.36                   193416
CT0081.dcm              2607730                256                    4.30                   1882801
CT0110.dcm              4725954                256                    8.87                   3196659
CT-MONO2-8-abdo.dcm     262940                 217                    0.23                   124563
CT-MONO2-16-ankle.dcm   525436                 89                     0.29                   175696
CT-MONO2-16-brain.dcm   525968                 256                    0.61                   360802
CT-MONO2-16-chest.dcm   145136                 256                    0.28                   145248

Table 6: Compression ratio and space savings.

File name               Compression ratio   Space savings (%)
CT0011.dcm              3.0631              67.35
CT0012.dcm              3.0353              67.05
CT0013.dcm              3.2072              68.82
CT0014.dcm              3.7010              72.98
CT0031.dcm              1.5564              35.75
CT0032.dcm              1.6347              38.83
CT0033.dcm              1.4180              29.48
CT0034.dcm              1.3864              27.87
CT0035.dcm              2.2567              55.69
CT0051.dcm              1.7555              43.04
CT0052.dcm              1.7827              43.91
CT0055.dcm              1.6360              38.87
CT0056.dcm              1.5739              36.46
CT0059.dcm              1.8736              46.63
CT0081.dcm              1.3850              27.80
CT0110.dcm              1.4784              32.36
CT-MONO2-8-abdo.dcm     2.1109              52.63
CT-MONO2-16-ankle.dcm   2.9906              66.56
CT-MONO2-16-brain.dcm   1.4578              31.40
CT-MONO2-16-chest.dcm   0.9992              −0.08

Table 7: BFD for the CT0013.dcm and CT0014.dcm files.

Byte (hexadecimal)   CT0013.dcm   CT0014.dcm
00                   663477       698830
01                   103          175
02                   237          59
03                   24806        14
...
11                   5178         45
12                   4703         18
13                   4719         13
14                   6152         6
...
FC                   3            6
FD                   2            5
FE                   24489        91
FF                   413          682

Table 8: BFD and codewords for the CT-MONO2-16-chest.dcm file.

Byte (hexadecimal)   BFD   Codeword
00                   938   0010110
01                   495   01001000
02                   532   01100111
03                   495   01001001
...
7D                   381   111111110
7E                   472   00110011
7F                   398   00000011
80                   222   001001110
...
FC                   266   100000000
FD                   307   101001010
FE                   210   1011100011
FF                   117   1011100010


For the CT-MONO2-16-chest.dcm file, the created codewords are 11 codewords of 7 bits, 19 of 9 bits, and 2 of 10 bits; the other codewords are 8 bits long. Of the total number of created codewords, only 11 symbols are compressed into 7 bits, while the others keep the 8-bit length or more (9 or 10 bits). Hence, the compression of the CT-MONO2-16-chest.dcm file generates a file larger than the original.

Another issue in using Huffman coding occurs when all bytes or symbols have the same occurrence frequency: the generated prefix tree then has a depth of log2(n), where n is the total number of symbols (n = 256 for each file). The generated codeword for each symbol has a length equal to the depth of the created prefix tree, log2(256) = 8 bits, so no compression results from the generated codewords. Figure 2 illustrates this Huffman coding limitation for the same occurrence frequency, assuming that each character needs only 3 bits to be encoded (A = 000, B = 001, C = 010, ..., H = 111).

Table 9 shows that the generated codewords have the same 3-bit length as the initial symbols; therefore, no compression occurs during the process. If the occurrence frequency of each symbol is 5, then the size of the original file (5 * 8 * 3 = 120 bits) is the same as that of the compressed file (5 * 8 * 3 = 120 bits). This illustration is quite similar to the case of the CT-MONO2-16-chest.dcm file, where the BFD values of the bytes are nearly equal, without concentrating on a much greater value. Consequently, the created prefix tree generates codewords of the same length as the initial bit length (8 bits), one fraction with a 7-bit length, and others with 9- and 10-bit lengths. Therefore, the compressed file becomes larger than the original file.

4.2. Security Aspect. One of the problems with the security of a DICOM file is that the information in the DICOM file itself can easily be read using general text editors such as Notepad or WordPad. This is a serious threat, as even the pixel data can be read just by taking the last bytes of the DICOM file. Figures 3 and 4 show the structural comparison between the original DICOM file and the compressed file. The symbol characters 129 to 132, previously read as "DICM" in the DICOM file, are unreadable in the compressed file. The prefix "12480", previously readable and indicating a DICOM file, is no longer present in the compressed DICOM file.

All characters that could previously be read directly, such as the patient's name, doctor's name, hospital, and date, change to unique characters in the compressed file. The set of last bytes no longer represents the pixel data of the original DICOM file. Interpreting the compressed DICOM file is now difficult without decompressing it, and the decompression process is known only to the user.

Figure 2: Prefix tree for symbols with the same appearance frequency. Eight symbols A-H form a perfectly balanced binary tree: the root splits into ABCD and EFGH, which split into AB, CD, EF, and GH, and finally into the leaves A through H, with each branch labeled 1 or 0.

Table 9: Codewords for symbols with the same occurrence frequency.

Symbol   Codeword
A        111
B        110
C        101
D        100
E        011
F        010
G        001
H        000


The worst thing anyone can do to the compressed file is to change its structure or byte distribution. Such a change may cause decompression to generate a file with a bit structure different from the original DICOM file; as a result, the generated file becomes unreadable in a DICOM viewer, i.e., corrupted. The results of the decompression process that converts the compressed file back into the original DICOM file are shown in Table 10.
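The readability argument above rests on the DICOM Part 10 layout, a 128-byte preamble followed by the ASCII marker "DICM"; a simplified illustrative check (not a full DICOM parser, and the fake header below is fabricated for the demo):

```python
def looks_like_dicom(data: bytes) -> bool:
    """A DICOM Part 10 file has a 128-byte preamble followed by 'DICM'."""
    return len(data) > 132 and data[128:132] == b"DICM"

# A minimal fake header: 128 zero bytes then the 'DICM' marker.
fake = bytes(128) + b"DICM" + b"\x02\x00"
assert looks_like_dicom(fake)
# After Huffman compression the byte layout changes, so the marker is no
# longer at offset 128 and viewers or text editors cannot recognize it.
assert not looks_like_dicom(bytes(200))
```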

4.3. Huffman Coding Compression Time. The duration of compression is proportional to the generated compressed file size: the larger the generated file, the longer the time required for compression. Figure 5 shows the effect of the generated file size on the duration of compression.

In Huffman coding, the input data are traversed once when computing the BFD and once when assigning a codeword to every symbol. If the input file size is defined as n, then the time needed is 2 * n. For the prefix tree, if the number of symbols is defined as |Σ|, with a depth of log2|Σ|, then the prefix tree's size becomes 2 * |Σ| (the total number of final nodes). From this state, we can obtain a traversed prefix tree for the collection of symbol codewords, whose construction time can be characterized by the prefix tree's best and worst cases.

The time required for decompression depends on the compressed file size, because decompression must scan every bit of the file from the first to the last. Figure 6 shows the effect of file size on the duration of decompression. Table 10 shows that decompression requires significantly more time than compression; for example, the CT0031.dcm file was decompressed in 101.93 seconds but compressed in only 17.25 seconds. However, the compression and decompression times are both proportional to the compressed size, not the original file size. Figure 6 displays the correlation between the compressed file sizes and the required decompression time.

4.4. Best Case Complexity. The best case in Huffman coding occurs when the constructed prefix tree is perfectly height-balanced, i.e., the left and right subtrees of each node have similar heights. In a perfectly height-balanced prefix tree, the traversal for each symbol takes log2|Σ|. Therefore, all symbols together take |Σ| * log2|Σ|, and if we assume the length of the data is n, the best case complexity becomes 2 * n + |Σ| * log2|Σ|, or O(n + |Σ| log2|Σ|).

4.5. Worst Case Complexity. The worst case in Huffman coding occurs when the constructed prefix tree degenerates into a linear (unbalanced) tree. In this case, reaching one node takes |Σ| steps; therefore, obtaining a codeword for every symbol takes |Σ| * |Σ| = |Σ|^2. Thus, the worst case time complexity becomes 2 * n + |Σ|^2, or O(n + |Σ|^2).

4.6. Comparison with JPEG2000. A lossless JPEG2000 compression implementation was applied to the same set of DICOM files listed in Table 4 to obtain a benchmark performance comparison with Huffman coding. Figure 7 depicts the comparison of compression time and size between JPEG2000 and Huffman coding. Overall, the results show that Huffman coding compression performance is comparable with the standard JPEG2000 compression method, with slightly faster compression times for CT images. However, JPEG2000 still outperforms Huffman coding when compressing large DICOM images such as CR, DR, and angiography.

Figure 3: CT0011.dcm file hex content.


Nevertheless, Huffman coding maintains the original file format and size, while JPEG2000, being not backward compatible, changes the original size upon decompression. This comparable performance gives Huffman coding an advantage as an alternative implementation of DICOM image compression in open PACS settings, given JPEG2000's proprietary implementation.

5. Conclusions

Huffman coding can generate a compressed DICOM file with a compression ratio of up to 1 : 3.7010 and space savings of 72.98%. The compression ratio and percentage of space savings are influenced by several factors, such as the number of symbols or initial nodes used to create the prefix tree and the BFD spread pattern of the compressed file. The time required for compressing and decompressing is proportional to the compressed file size: the larger the compressed file, the longer the time required to compress or decompress it.

Huffman coding has time complexity O(n + |Σ| log2|Σ|) and space complexity O(|Σ|), where n is the input file size and |Σ| is the number of symbols or initial nodes used to compile the prefix tree.

Figure 4: CT0011.HUF file hex content.

Table 10: Decompression results.

File name               Original size (byte)   Time (s)   Decompressed size (byte)
CT0011.dcm              1371932                21.04      4202378
CT0012.dcm              346880                 5.92       1052902
CT0013.dcm              328241                 5.66       1052750
CT0014.dcm              284727                 4.08       1053770
CT0031.dcm              5057232                101.93     7871216
CT0032.dcm              322986                 6.06       527992
CT0033.dcm              2592421                48.05      3675976
CT0034.dcm              380848                 7.18       528012
CT0035.dcm              1395825                23.91      3149976
CT0051.dcm              118713                 2.41       208402
CT0052.dcm              145620                 2.93       259602
CT0055.dcm              127395                 2.62       208416
CT0056.dcm              164952                 3.38       259616
CT0059.dcm              193416                 3.83       362378
CT0081.dcm              1882801                37.82      2607730
CT0110.dcm              3196659                65.76      4725954
CT-MONO2-8-abdo.dcm     124563                 2.28       262940
CT-MONO2-16-ankle.dcm   175696                 2.34       525436
CT-MONO2-16-brain.dcm   360802                 7.04       525968
CT-MONO2-16-chest.dcm   145248                 3.01       145136


Figure 6: File size effect on duration of decompression (decompressed file size in bytes, 0 to 6,000,000, versus duration of decompression in seconds, 0 to 120).

Figure 5: File size effect on duration of compression (compressed file size in bytes, 0 to 6,000,000, versus duration of compression in seconds, 0 to 20).

Figure 7: Performance comparison with JPEG2000 (compression time in seconds and compression size in bytes for JPEG2000 and Huffman coding).


The structure of the compressed file cannot be easily interpreted as a DICOM file by using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all information in the original DICOM file.

There is a limitation: the technique stops compressing when each symbol or node from the compressed file has the same occurrence frequency (all symbols have the same value of BFD).

Compression of the DICOM image can also be conducted only on the pixel data of the image, without changing the overall file structure, so that the generated compressed file can still be directly read as a DICOM file with a compressed pixel data size. Thus, for future research, encryption of the important information in DICOM files, such as patient ID, name, and date of birth, is considered to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will focus on transfer time, compression and decompression, and image reading quality evaluation.

Data Availability

The partial data are obtained from publicly available medical image data from http://www.mmnt.net, while a small set of private data is obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS/2/2014/ICT07/MUSM/03/1.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055–1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390–1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm—a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253–257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80–88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558–566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68–76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Auli-Llinas, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?," Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression—a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598–1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54–60, 2015.

[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122–125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607–610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818–821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587–593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263–5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835–838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics, and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.




Lossy data compression is usually used for generating a higher compression ratio without considering the loss of information in the image [7]. The latest research on lossy data compression was conducted by Kumar et al. [8], who used a logarithm method called LDCL (lossy data compression logarithm) in their methodology. Their experimental results showed that this method could generate a compression ratio of up to 1:60 in many cases.

Lossless JPEG2000 is the popular data compression method used in various PACS and is considered the standard for DICOM compression [9], despite being not backward compatible [10]. Nevertheless, ongoing research is still being carried out to analyze the performance of the JPEG2000 compression method, as well as to propose alternative compression methods in PACS, with the aim of balancing image quality and transfer duration [9, 11–15]. Thus, this work implements and provides the performance analysis of Huffman coding, identified as one of the lossless standard data compression methods by the US Food and Drug Administration (FDA) [16].

Existing work on the adoption of Huffman coding to compress the DICOM image by Kavinder [17] did not address the performance, security aspects, complexity, and compression time for compressing the DICOM image file while considering the information stored in the file.

2. Related Work

Huffman coding has been used in many data compression cases. In 2015, Ezhilarasu et al. [18] reviewed Huffman coding and concluded that the Huffman code can provide a better compression ratio, space savings, and average bits than uncompressed data. A comparative study was performed by Maan [19] in 2013, who analyzed and compared three lossless data compression codings, namely, Huffman, arithmetic, and run length. The experimental results showed that arithmetic coding can generate the highest compression ratio among lossless data compression techniques, but its compression speed is slower than that of Huffman coding.

Another related research work was done by Medeiros et al. in 2014 [20]. They compressed lightweight data for wireless sensor networks (WSNs) monitoring environmental parameters with low-resolution sensors. The obtained percentage of the compression ratio in their experiment varied from 46% to 82%. The researchers stated that Huffman coding is extremely simple and outperforms lossless entropy compression (LEC) and adaptive linear filtering compression (ALFC) in most cases.

Research has been conducted on DICOM image file compression using various techniques. In fact, several studies combined lossless and lossy data compression techniques. In 2013, Kavinder [17] combined Huffman coding (lossless) and the discrete cosine transform (lossy) and improved the technique by using vector quantization to increase the compression ratio. In 2015, Kumar and Kumar [21] used hybrid techniques of discrete wavelet transform-discrete cosine transform (DWT-DCT) and Huffman coding, while Fahmi et al. introduced sequential storage of differences for image compression in a medical image cloud application [22, 23]. Other works on lossless and lossy data compression techniques are found in [24–28].

3. Materials and Methods

In previous studies, the lossy data compression technique generates a high compression ratio but decreases the peak signal-to-noise ratio (PSNR), a quality metric generally used to analyse the quality of an image. The higher the PSNR, the better the quality of the compressed or reconstructed image. Thus, a lossless technique should be applied to enhance compression while keeping the PSNR intact. An image can be compressed without the loss of significant details through Huffman coding. From the perspective of a DICOM file, we expect that the DICOM image has intact quality and metadata after its compression and decompression. The standard DICOM compression method is JPEG2000, and thus we compare the performance of JPEG2000 and Huffman coding as an alternative DICOM compression method. The lossless criteria of Huffman coding are the foundation of this work. Image quality after decompression is a vital point here and is the reason for selecting Huffman coding as the methodology.
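As context for the PSNR metric discussed above, the following is a minimal Python sketch of how PSNR is conventionally computed. It is for illustration only (the authors' implementation is in VB.NET), and the pixel lists are hypothetical toy data:

```python
import math

def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel
    sequences; a higher PSNR means a closer reconstruction."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")  # identical images, i.e., a lossless round trip
    return 10 * math.log10(max_value ** 2 / mse)

# A lossless method such as Huffman coding reproduces the pixels
# exactly, so the PSNR of the reconstructed image is infinite.
print(psnr([10, 20, 30], [10, 20, 30]))  # inf
```

This makes concrete why a lossless method trivially "wins" on PSNR: the mean squared error is zero, so the ratio is unbounded.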

Figure 1 shows the three parts of the methodology. The first part is Huffman encoding for DICOM image file compression. The DICOM image file is collected first and used as a source. This part encodes (compresses) the file by calculating the byte frequency distribution (BFD), creating a prefix tree to get codewords, changing the byte distribution into codewords, and then performing bit padding if necessary. The second part is View, which displays the DICOM image and includes the steps for determining the image frame, setting the window level and width, and resizing the image. The third part is Huffman decoding for decompression. In the decoding process, a prefix tree is read from the compressed data flow file, and codeword threads are extracted from the data flow and changed back into the original byte distribution. The more detailed steps of the methodology are described below.

3.1. Huffman Encode. The input DICOM file is compressed in the first step. The byte distribution of the file is read and its BFD is calculated. Next, a prefix tree is created for the acquisition of codewords that will substitute the byte distribution of the input file, generating a new, smaller file. The data used for encoding are the byte distributions compiled in the DICOM file, as shown in the following (in hexadecimal):

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
44 49 43 4d 02 00 00 00 55 4c 04 00 be 00 00 00
82 04 82 04 82 04 82 04 82 04 82 04 82 04 82 04
7d 04 78 04 78 04 78 04 78 04 78 04 69 04 5a 04
5a 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The BFD, which is a table storing the frequency of occurrence of every byte that compiles the file, is calculated for each DICOM file. For example, when the FF byte (255) occurs 887 times in a file, the value of the FF byte (255) in the BFD table is 887. Table 1 shows the BFD from one of the DICOM files used in the experiment. If the FF byte (255) occurs 200 times in a file, then the value of the FF byte (255) in the BFD table is 200.
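As an illustrative sketch (in Python, not the authors' VB.NET code), the BFD is simply a frequency table over the bytes of the file:

```python
from collections import Counter

def byte_frequency_distribution(data: bytes) -> Counter:
    """Count how often each byte value (0-255) occurs in the file."""
    return Counter(data)

# Toy file contents: the byte FF occurs twice, so its BFD value is 2.
bfd = byte_frequency_distribution(bytes([0x00, 0xFF, 0x00, 0xFF, 0x04]))
print(bfd[0xFF])  # 2
```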

Once the calculation of the BFD is completed, a prefix tree is created for the acquisition of appropriate codewords as substitutes for the byte distribution [6]. Table 2 shows the generated codewords after searching the prefix tree, which was created on the basis of the BFD in Table 1.
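The prefix-tree construction can be sketched with the classic merge-the-two-smallest-nodes loop. This is a Python illustration under the assumption of a standard Huffman build; the tie-breaking order, and hence the exact codewords, may differ from those in Tables 1 and 2:

```python
import heapq

def huffman_codewords(bfd: dict) -> dict:
    """Build a Huffman prefix tree from a byte-frequency table and
    return a bit-string codeword for every symbol."""
    # Heap entries are (frequency, tie-breaker, tree); a tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(freq, sym, sym) for sym, freq in bfd.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: a single symbol gets a 1-bit code
        return {heap[0][2]: "0"}
    tick = 256  # tie-breaker for internal nodes, disjoint from byte values
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two lowest-frequency subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tick, (left, right)))
        tick += 1
    codes = {}
    def walk(tree, prefix):  # traverse the finished tree to collect codewords
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

# Four of the CT0011.dcm frequencies from Table 1: the dominant byte 00
# receives the shortest codeword.
codes = huffman_codewords({0x00: 2661671, 0x03: 49653, 0xFE: 49043, 0xFF: 887})
print(min(codes, key=lambda s: len(codes[s])))  # 0
```

The key property, visible in Tables 1 and 2, is that the highest-frequency byte ends up closest to the root and therefore with the fewest bits.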

Bit padding is the process of adding one or more bits into the data flow to fit the minimum 8-bit computer architecture. For example, when the generated data size is 7 bits, then 1 padding bit is required to fulfill the 8 bits (1 byte); if the generated data size is 28 bits, then 4 padding bits are required to fulfill the 32 bits (4 bytes), and so on.
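The padding rule above amounts to rounding the encoded bit length up to the next byte boundary, as in this small Python sketch:

```python
def pad_bits(bitstring: str) -> str:
    """Append zero bits so the encoded length becomes a multiple of 8."""
    padding = (8 - len(bitstring) % 8) % 8
    return bitstring + "0" * padding

# 7 data bits need 1 padding bit (1 byte); 28 need 4 (4 bytes).
print(len(pad_bits("1" * 7)))   # 8
print(len(pad_bits("1" * 28)))  # 32
```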

3.2. View. The View part comprises the features that are built as the user interface. This part shows the user the compressed DICOM image, displayed in accordance with the original size or the size of the provided canvas. This part is called the DICOM Viewer in other systems. The View loads the image from the DICOM file, determines the DICOM image frame to be displayed and the

Table 1: BFD for CT0011.dcm file.

Byte (hexadecimal)   Byte (decimal)   Frequency
00                   0                2661671
01                   1                613
02                   2                724
03                   3                49653
04                   4                702819
...
FD                   253              2
FE                   254              49043
FF                   255              887

Table 2: Codeword for byte distribution of CT0011.dcm file.

Byte (hexadecimal)   Byte (decimal)   Codeword
00                   0                1
01                   1                010000011110
02                   2                010100111001
03                   3                010111
04                   4                00
...
FD                   253              011111011100100011010
FE                   254              010101
FF                   255              011011001001

Figure 1: General architecture (the Huffman encode, View, and Huffman decode parts with their respective steps).


window level and width to be configured, resizes the image, and reads and displays the pixel distribution of the image. The View also describes the metadata of the DICOM file.

3.3. Huffman Decode. The Huffman decode decompresses the file to restore the original file. This part starts with the reading of the data flow of the compressed file, the creation of a prefix tree from the data stored in the data flow during the encoding process, and the construction of a lookup table that contains codewords and symbols to be changed back into the original byte structure before the compression [29]. Using a bottom-up Huffman tree with probabilities [30] is good in terms of run time; however, for the case of DICOM compression, we found that using a lookup table provides a balance between run time and memory usage.

The lookup table contains the codewords and the symbols they represent, generated from the search of the previously created prefix tree. The table is used for changing the codeword thread from the data flow back into the original byte distribution. Table 3 shows the lookup table for the compressed DICOM file CT0011.dcm (CT0011.huf).

After reading the codeword thread of the compressed file data flow, the generated codeword is changed into the original byte distribution, or symbol, according to the created lookup table. This step is conducted by reading every bit of the data flow and determining whether the bit is in the lookup table. If not, the next bit is read and threaded with the previous bit, and the search is repeated in the lookup table. If the thread of bits is found as one of the codewords that represents a symbol, then the thread is changed into that symbol. The process continues until the number of returned symbols reaches the original data size.
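The bit-by-bit lookup described above can be sketched as follows (Python for illustration; the lookup table here is a toy subset in the style of Table 3, not the full CT0011.huf table):

```python
def huffman_decode(bits: str, lookup: dict, original_size: int) -> list:
    """Grow a thread of bits until it matches a codeword in the lookup
    table, emit the matching symbol, and repeat until the original
    number of symbols has been restored."""
    symbols, thread = [], ""
    for bit in bits:
        thread += bit
        if thread in lookup:
            symbols.append(lookup[thread])
            thread = ""
            if len(symbols) == original_size:
                break  # the rest of the stream is bit padding
    return symbols

lookup = {"1": 0, "00": 4, "010111": 3}  # codeword -> symbol
print(huffman_decode("1" "00" "010111" "0000", lookup, 3))  # [0, 4, 3]
```

Because the codewords form a prefix code, no thread that matches a codeword can be the beginning of a longer codeword, so the greedy match is unambiguous; the `original_size` stop condition is what keeps the trailing padding bits from being misread as data.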

The programming language used in this research is VB.NET, while the operating system used is Windows 7 Ultimate 64-bit SP1. The computer specification is an Intel i5-450M processor, 4 GB RAM, 500 GB HDD, and an ATI Radeon HD 5470 graphics card.

The data used in this research work are several sets of DICOM image files available at http://www.mmnt.net, plus another set of anonymized DICOM images from computed tomography collected randomly from the University Hospital RS USU Medan. The files used were 20 DICOM image files with the extension *.dcm. The specifications of all DICOM files are described in Table 4.

4. Results and Discussion

4.1. Huffman Coding Compression Performances. Tables 5 and 6 present the results of DICOM file compression through the Huffman coding technique; the file specifications are provided in Table 4. From the obtained results, the percentage of space savings is up to 72.98% at a 1:3.7010 compression ratio, while the lowest space saving percentage is −0.08%. The worst compression ratio is 1:0.9992.
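The two reported metrics follow the usual definitions (compression ratio = original size / compressed size; space savings = 1 − compressed/original). A quick Python check against the CT0014.dcm row of Tables 5 and 6:

```python
def compression_ratio(original: int, compressed: int) -> float:
    """Ratio 1:r, reported here as r = original size / compressed size."""
    return original / compressed

def space_savings(original: int, compressed: int) -> float:
    """Percentage of the original size removed by compression."""
    return (1 - compressed / original) * 100

# CT0014.dcm: 1,053,770 bytes compressed to 284,727 bytes.
print(f"{compression_ratio(1053770, 284727):.4f}")  # 3.7010
print(f"{space_savings(1053770, 284727):.2f}")      # 72.98
```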

One of the factors that affect the compression ratio is the number of nodes or symbols that create the prefix tree of the image file. This is shown by the CT-MONO2-16-ankle.dcm and CT-MONO2-16-brain.dcm files, which have nearly the same original size (±525 kB) but different compression ratios: the compression ratio of the CT-MONO2-16-ankle.dcm file was twice that of the CT-MONO2-16-brain.dcm file.

Compared to the other image data sets in Table 6, the CT0013.dcm file has a smaller compression ratio than the CT0014.dcm file, although the former has fewer symbols. Hence, another factor besides the number of symbols affects the value of the compression ratio. One such factor is the BFD value of the file. The BFD values of the CT0013.dcm and CT0014.dcm files are shown in Table 7.

The BFD of the CT0013.dcm file spreads more evenly than that of the CT0014.dcm file. This causes the codewords of the former to be longer than those of the latter. For instance, if the assumed codeword length for byte 00 is 2 bits, for byte 03 is 5 bits, and for byte FE is 6 bits, then the obtained size of the CT0013.dcm file, when the other bytes are disregarded, is as follows:

(663477 × 2 + 24806 × 5 + 24489 × 6) = 1597918 bit ≈ 199740 byte, (1)

while the obtained size of the CT0014.dcm file is as follows:

Table 3: Lookup table for CT0011.HUF file.

Symbol   Codeword
0        1
1        010000011110
2        010100111001
3        010111
4        00
...
253      011111011100100011010
254      010101
255      011011001001

Table 4: DICOM file specification used in the experiments.

File name               Number of frames   Size (byte)
CT0011.dcm              8                  4202378
CT0012.dcm              2                  1052902
CT0013.dcm              2                  1052750
CT0014.dcm              2                  1053770
CT0031.dcm              15                 7871216
CT0032.dcm              1                  527992
CT0033.dcm              7                  3675976
CT0034.dcm              1                  528012
CT0035.dcm              6                  3149976
CT0051.dcm              1                  208402
CT0052.dcm              1                  259602
CT0055.dcm              1                  208416
CT0056.dcm              1                  259616
CT0059.dcm              1                  362378
CT0081.dcm              2                  2607730
CT0110.dcm              9                  4725954
CT-MONO2-8-abdo.dcm     1                  262940
CT-MONO2-16-ankle.dcm   1                  525436
CT-MONO2-16-brain.dcm   1                  525968
CT-MONO2-16-chest.dcm   1                  145136


(698830 × 2 + 14 × 5 + 91 × 6) = 1398276 bit ≈ 174785 byte. (2)
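The two weighted sums in equations (1) and (2) can be checked directly (Python; the 2-, 5-, and 6-bit codeword lengths are the assumptions stated in the text):

```python
# Weighted codeword lengths for the dominant bytes 00, 03, and FE of
# CT0013.dcm and CT0014.dcm (frequencies from Table 7).
ct0013_bits = 663477 * 2 + 24806 * 5 + 24489 * 6
ct0014_bits = 698830 * 2 + 14 * 5 + 91 * 6
print(ct0013_bits, ct0013_bits / 8)  # 1597918 199739.75 (about 199740 byte)
print(ct0014_bits, ct0014_bits / 8)  # 1398276 174784.5 (about 174785 byte)
```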

In the experimental results, we also observe a limitation of Huffman coding, where one case generates a compression ratio of less than 1 (ratio < 1), causing the generated compressed file to be larger than the original file.

Table 8 shows the BFD pattern and the codewords of the CT-MONO2-16-chest.dcm file, which has a 1:0.9992 compression ratio. The BFD values are nearly evenly distributed, which causes the created prefix tree to generate codewords that are approximately equal to or longer than the bit length required for the creation of 1 byte (8 bits). For the

Table 5: Compression results.

File name               Original size (byte)   No. of symbols/nodes   Compression time (s)   Compressed size (byte)
CT0011.dcm              4202378                254                    2.75                   1371932
CT0012.dcm              1052902                223                    0.67                   346880
CT0013.dcm              1052750                209                    0.60                   328241
CT0014.dcm              1053770                254                    0.65                   284727
CT0031.dcm              7871216                256                    17.25                  5057232
CT0032.dcm              527992                 256                    0.65                   322986
CT0033.dcm              3675976                256                    6.50                   2592421
CT0034.dcm              528012                 256                    0.78                   380848
CT0035.dcm              3149976                256                    3.02                   1395825
CT0051.dcm              208402                 256                    0.24                   118713
CT0052.dcm              259602                 256                    0.29                   145620
CT0055.dcm              208416                 256                    0.26                   127395
CT0056.dcm              259616                 256                    0.31                   164952
CT0059.dcm              362378                 256                    0.36                   193416
CT0081.dcm              2607730                256                    4.30                   1882801
CT0110.dcm              4725954                256                    8.87                   3196659
CT-MONO2-8-abdo.dcm     262940                 217                    0.23                   124563
CT-MONO2-16-ankle.dcm   525436                 89                     0.29                   175696
CT-MONO2-16-brain.dcm   525968                 256                    0.61                   360802
CT-MONO2-16-chest.dcm   145136                 256                    0.28                   145248

Table 6: Compression ratio value and space savings.

File name               Compression ratio   Space savings (%)
CT0011.dcm              3.0631              67.35
CT0012.dcm              3.0353              67.05
CT0013.dcm              3.2072              68.82
CT0014.dcm              3.7010              72.98
CT0031.dcm              1.5564              35.75
CT0032.dcm              1.6347              38.83
CT0033.dcm              1.4180              29.48
CT0034.dcm              1.3864              27.87
CT0035.dcm              2.2567              55.69
CT0051.dcm              1.7555              43.04
CT0052.dcm              1.7827              43.91
CT0055.dcm              1.6360              38.87
CT0056.dcm              1.5739              36.46
CT0059.dcm              1.8736              46.63
CT0081.dcm              1.3850              27.80
CT0110.dcm              1.4784              32.36
CT-MONO2-8-abdo.dcm     2.1109              52.63
CT-MONO2-16-ankle.dcm   2.9906              66.56
CT-MONO2-16-brain.dcm   1.4578              31.40
CT-MONO2-16-chest.dcm   0.9992              −0.08

Table 7: BFD for CT0013.dcm and CT0014.dcm files.

Byte (hexadecimal)   CT0013.dcm   CT0014.dcm
00                   663477       698830
01                   103          175
02                   237          59
03                   24806        14
...
11                   5178         45
12                   4703         18
13                   4719         13
14                   6152         6
...
FC                   3            6
FD                   2            5
FE                   24489       91
FF                   413          682

Table 8: BFD and codeword for CT-MONO2-16-chest.dcm file.

Byte (hexadecimal)   BFD   Codeword
00                   938   0010110
01                   495   01001000
02                   532   01100111
03                   495   01001001
...
7D                   381   111111110
7E                   472   00110011
7F                   398   00000011
80                   222   001001110
...
FC                   266   100000000
FD                   307   101001010
FE                   210   1011100011
FF                   117   1011100010


CT-MONO2-16-chest.dcm file, the created codewords comprise 11 codewords that are 7 bits long, 19 that are 9 bits long, and 2 that are 10 bits long; the other codewords are 8 bits long. From the total number of created codewords, only 11 symbols are compressed into 7 bits, while the others still have a length of 8 bits or more (9 bits and 10 bits). Hence, the compression of the CT-MONO2-16-chest.dcm file generates a larger file size than the original size.

Another issue in using Huffman coding occurs when all bytes or symbols have the same occurrence frequency and the generated prefix tree has a depth of log2(n), where n is the total number of symbols (n = 256 for each file). The generated codeword for each symbol then has a length equal to the depth of the created prefix tree: log2(256) = 8 bits. In this case, no compression results from the generated codewords. Figure 2 illustrates this Huffman coding limitation for the same occurrence frequency, assuming that each character needs only 3 bits to be compiled (A = 000, B = 001, C = 010, ..., H = 111).

Table 9 shows that the generated codewords have the same 3-bit length as the initial symbols. Therefore, no compression occurs during the process. If the occurrence frequency of each symbol is 5, then the size of the original file (5 × 8 × 3 = 120 bits) will be the same as that of the compressed file (5 × 8 × 3 = 120 bits). This illustration is quite similar to the case of the CT-MONO2-16-chest.dcm file, where the BFD values for the bytes have nearly the same value without centering on a much greater value. Consequently, the created prefix tree generates codewords of roughly the same length as the initial bit length (8 bits), with one fraction 7 bits long and others 9 and 10 bits long. Therefore, the size of the compressed file becomes larger than that of the original file.
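The equal-frequency limitation can be reproduced with a short experiment (a Python sketch that computes only the codeword lengths, which is all the argument needs):

```python
import heapq

def huffman_code_lengths(freqs):
    """Return the Huffman codeword length of each symbol, using the
    usual merge-the-two-smallest heap loop."""
    heap = [(f, [i]) for i, f in enumerate(freqs)]  # (subtree freq, symbols)
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        f1, s1 = heapq.heappop(heap)
        f2, s2 = heapq.heappop(heap)
        for i in s1 + s2:  # every symbol under a merge gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (f1 + f2, s1 + s2))
    return lengths

# 256 symbols with identical frequency: every codeword is exactly
# log2(256) = 8 bits, so the "compressed" output is not smaller.
print(set(huffman_code_lengths([5] * 256)))  # {8}
```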

4.2. Security Aspect. One of the problems in the security of the DICOM file is that the information in the DICOM file itself can easily be read by using general text editors like Notepad or WordPad. This is a serious threat, as even pixel data can be read by taking only the last few bytes of the DICOM file. Figures 3 and 4 show the structural comparison between the original DICOM file and the compressed file. The symbol characters 129 to 132, previously read as "DICM" in the DICOM file, are unreadable in the compressed file. The prefix "12480", previously readable and indicating a DICOM file, is no longer available in the compressed DICOM file.

All characters that could previously be read directly, such as the patient's name, doctor's name, hospital, and date, change to unique characters in the compressed file. The set of last bytes no longer represents the pixel data of the original DICOM file. Interpreting the compressed DICOM file is now difficult without decompressing it, and the decompression process is known only by the user. The worst thing that can be done by anyone on the

Figure 2: Prefix tree for symbols with the same appearance frequency (a perfectly balanced binary tree over the symbols A–H).

Table 9: Codewords for symbols with the same occurrence frequency.

Symbol   Codeword
A        111
B        110
C        101
D        100
E        011
F        010
G        001
H        000


compressed file is to change its structure or byte distribution. This change may cause decompression to generate a file with a different bit structure from the original DICOM file. As a result, the generated file becomes unreadable in the DICOM viewer, i.e., corrupted. The result of the decompression process converting the compressed file back into the original DICOM file is shown in Table 10.

4.3. Huffman Coding Compression Time. The duration of compression is proportional to the generated compressed file size. The larger the generated file size, the longer the time required for compression. Figure 5 shows the effect of the generated file size on the duration of compression.

In Huffman coding, the input data are traversed once when computing the BFD values and once when assigning a codeword to every symbol. If the input file size is defined as n, then the time needed is 2n. For the prefix tree, if the number of nodes or symbols is defined as |Σ|, with a depth of log2|Σ|, then the prefix tree's size becomes 2|Σ| (the total number of final nodes). From this state, we can obtain a traversed prefix tree for the collection of symbol codewords, whose construction time can be characterized by its best and worst cases.

The time required for decompressing depends on the compressed file size, because the search to decompress covers all the bits in the file from the first to the last bit. Table 10 shows that a significantly longer time is required for the decompression process than for compression; for example, the CT0031.dcm file was decompressed in 101.93 seconds but compressed in only 17.25 seconds. However, the compression and decompression times are both proportional to the compressed size and not to the original file size. Figure 6 displays the correlation between the compressed file sizes and the required decompression time.

4.4. Best Case Complexity. The best case in Huffman coding occurs when the constructed prefix tree forms the shape of a perfectly height-balanced tree, where the left and right subtrees of every node have similar heights. In a perfectly height-balanced prefix tree, the traversal for each symbol takes log2|Σ|. Therefore, all symbols take |Σ| × log2|Σ|, and if we assume the length of the data is n, the best case complexity becomes 2n + |Σ| × log2|Σ|, or O(n + |Σ| × log2|Σ|).

4.5. Worst Case Complexity. The worst case in Huffman coding occurs when the constructed prefix tree degenerates into a linear, unbalanced tree. In this case, reaching a single node takes up to |Σ| steps, so collecting a codeword for every symbol takes |Σ| · |Σ| = |Σ|²; the worst-case time complexity thus becomes 2n + |Σ|², or O(n + |Σ|²).
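Both extremes can be checked numerically from frequencies alone; the sketch below (our own helper, not from the paper) derives Huffman codeword lengths via the standard merge procedure:

```python
import heapq

def code_lengths(freqs):
    """Return sorted Huffman codeword lengths for the given symbol
    frequencies; each merge pushes the merged leaves one level deeper."""
    heap = [(f, [0]) for f in freqs]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, d1 = heapq.heappop(heap)
        f2, d2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, [d + 1 for d in d1 + d2]))
    return sorted(heap[0][1])

# Uniform BFD over 256 symbols: perfectly balanced tree, log2(256) = 8-bit codes.
print(set(code_lengths([1] * 256)))              # {8}
# Exponentially skewed BFD: degenerate (linear) tree, codes up to |S| - 1 bits.
print(code_lengths([2 ** i for i in range(8)]))  # [1, 2, 3, 4, 5, 6, 7, 7]
```

The first call reproduces the best (balanced) case and the second the worst (degenerate) case described above.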

4.6. Comparison with JPEG2000. Lossless JPEG2000 compression was applied to the same set of DICOM files listed in Table 4 in order to obtain a benchmark performance comparison with Huffman coding. Figure 7 depicts the comparison of compression time and size between JPEG2000 and Huffman coding. Overall, the results show that Huffman coding compression performance was comparable with the standard JPEG2000 compression method, with slightly faster compression times for CT images. However, JPEG2000 still outperforms Huffman coding in compressing large DICOM images such as CR, DR, and angiography.

Figure 3: CT0011.dcm file hex content.

Journal of Healthcare Engineering 7

Nevertheless, Huffman coding maintains the original file format and size, while JPEG2000, being not backward compatible, changes the original size upon decompression. This comparable performance gives Huffman coding an advantage as an alternative implementation of DICOM image compression in open PACS settings, given JPEG2000's proprietary implementation.

5. Conclusions

Huffman coding can generate a compressed DICOM file with a compression ratio of up to 1:3.7010 and space savings of 72.98%. The compression ratio and the percentage of space savings are influenced by several factors, such as the number of symbols or initial nodes used to create the prefix tree and the pattern of the BFD spread of the compressed file. The time required for compressing and decompressing is proportional to the compressed file size; that is, the larger the compressed file size, the longer the time required to compress or decompress the file.
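The headline figures can be reproduced from Table 5 (CT0014.dcm: 1,053,770 bytes original, 284,727 bytes compressed):

```python
original, compressed = 1_053_770, 284_727  # CT0014.dcm sizes from Table 5

ratio = original / compressed              # about 3.7010, i.e., a 1:3.7010 ratio
space_savings = 1 - compressed / original  # about 0.7298, i.e., 72.98% saved

print(f"ratio 1:{ratio:.4f}, space savings {space_savings:.2%}")
```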

Huffman coding has time complexity O(n + |Σ| log2 |Σ|) and space complexity O(|Σ|), where n is the size of the input file read and |Σ| is the number of symbols or initial nodes used to build the prefix tree.

Figure 4: CT0011.HUF file hex content.

Table 10: Decompression results.

File name              Compressed size (byte)  Time (s)  Decompressed size (byte)
CT0011.dcm                            1371932     21.04                  4202378
CT0012.dcm                             346880      5.92                  1052902
CT0013.dcm                             328241      5.66                  1052750
CT0014.dcm                             284727      4.08                  1053770
CT0031.dcm                            5057232    101.93                  7871216
CT0032.dcm                             322986      6.06                   527992
CT0033.dcm                            2592421     48.05                  3675976
CT0034.dcm                             380848      7.18                   528012
CT0035.dcm                            1395825     23.91                  3149976
CT0051.dcm                             118713      2.41                   208402
CT0052.dcm                             145620      2.93                   259602
CT0055.dcm                             127395      2.62                   208416
CT0056.dcm                             164952      3.38                   259616
CT0059.dcm                             193416      3.83                   362378
CT0081.dcm                            1882801     37.82                  2607730
CT0110.dcm                            3196659     65.76                  4725954
CT-MONO2-8-abdo.dcm                    124563      2.28                   262940
CT-MONO2-16-ankle.dcm                  175696      2.34                   525436
CT-MONO2-16-brain.dcm                  360802      7.04                   525968
CT-MONO2-16-chest.dcm                  145248      3.01                   145136


Figure 6: File size effect on duration of decompression (decompressed file size in bytes versus duration of decompression in seconds).

Figure 5: File size effect on duration of compression (compressed file size in bytes versus duration of compression in seconds).

Figure 7: Performance comparison with JPEG2000 (compression time in seconds and compressed size in bytes for JPEG2000 and Huffman coding).


The structure of the compressed file cannot be easily interpreted as a DICOM file using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all the information in the original DICOM file.

There is a limitation: the technique stops compressing when every symbol or node in the file has the same occurrence frequency (all symbols have the same BFD value).
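A quick check of this limit (standard entropy reasoning, not from the paper): with all 256 byte values equally frequent, the balanced prefix tree assigns every symbol a log2(256) = 8-bit codeword, so the output is exactly as long as the input:

```python
import math

symbols = 256                        # distinct byte values, all with equal BFD
depth = math.log2(symbols)           # balanced prefix tree depth
assert depth == 8                    # so every codeword is 8 bits long

n_bytes = 1_000_000                  # any input length, in bytes
compressed_bits = n_bytes * depth    # each byte becomes an 8-bit codeword
print(compressed_bits == n_bytes * 8)  # True: no compression achieved
```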

Compression of the DICOM image can also be conducted only on the pixel data of the image, without changing the overall file structure, so that the generated compressed file can still be read directly as a DICOM file, only with a compressed pixel data size. Thus, for future research, encryption of the important information in DICOM files, such as patient ID, name, and date of birth, is considered to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will focus on transfer time, compression and decompression, and image reading quality evaluation.

Data Availability

The partial data are obtained from publicly available medical image data from http://mmnt.net, while a small set of private data are obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS22014ICT07MUSM031.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1-6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055-1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390-1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm - a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253-257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80-88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558-566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68-76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Auli-Llinas, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552-560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?," Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression - a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598-1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54-60, 2015.

[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential - storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122-125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607-610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818-821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587-593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263-5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835-838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.






[3] P M A van Ooijen K Y Aryanto A Broekema and S HoriildquoDICOM data migration for PACS transition procedure andpitfallsrdquo International journal of computer assisted radiologyand surgery vol 10 no 7 pp 1055ndash1064 2015

[4] Y Yuan L Yan Y Wang G Hu and M Chen ldquoSharing oflarger medical DICOM imaging data-sets in cloud comput-ingrdquo Journal of Medical Imaging and Health Informaticsvol 5 no 7 pp 1390ndash1394 2015

[5] D Salomon and G Motta Handbook of Data CompressionSpringer Science amp Business Media Berlin Germany 2010

[6] P M Parekar and S S )akare ldquoLossless data compressionalgorithmmdasha reviewrdquo International Journal of ComputerScience amp Information Technologies vol 5 no 1 2014

[7] F Garcia-Vilchez J Munoz-Mari M Zortea et al ldquoOn theimpact of lossy compression on hyperspectral image classi-fication and unmixingrdquo IEEE Geoscience and remote sensingletters vol 8 no 2 pp 253ndash257 2011

[8] V Kumar S Barthwal R Kishore R Saklani A Sharma andS Sharma ldquoLossy data compression using Logarithmrdquo 2016httparxivorgabs160402035

[9] M Fatehi R Safdari M Ghazisaeidi M Jebraeily andM Habibikoolaee ldquoData standards in tele-radiologyrdquo ActaInformatica Medica vol 23 no 3 p 165 2015

[10] T Richter A Artusi and T Ebrahimi ldquoJPEG XT a newfamily of JPEG backward-compatible standardsrdquo IEEEMultiMedia vol 23 no 3 pp 80ndash88 2016

[11] D Haak C-E Page S Reinartz T Kruger andT M Deserno ldquoDICOM for clinical research PACS-integrated electronic data capture in multi-center trialsrdquoJournal of Digital Imaging vol 28 no 5 pp 558ndash566 2015

[12] F Liu M Hernandez-Cabronero V Sanchez M Marcellinand A Bilgin ldquo)e current role of image compressionstandards in medical imagingrdquo Information vol 8 no 4p 131 2017

[13] S Priya ldquoA novel approach for secured transmission ofDICOM imagesrdquo International Journal of Advanced In-telligence Paradigms vol 12 no 1-2 pp 68ndash76 2019

[14] J Bartrina-Rapesta V Sanchez J Serra-SagristaM W Marcellin F Aulı-Llinas and I Blanes ldquoLosslessmedical image compression through lightweight binaryarithmetic codingrdquo in Proceedings of the Applications ofDigital Image Processing XL SPIE Optical Engineering +Applications vol 10396 article 103960S International Societyfor Optics and Photonics San Diego CA USA 2017

[15] S S Parikh D Ruiz H Kalva G Fernandez-Escribano andV Adzic ldquoHigh bit-depth medical image compression withhevcrdquo IEEE Journal of Biomedical and Health Informaticsvol 22 no 2 pp 552ndash560 2018

[16] D A Koff and H Shulman ldquoAn overview of digital com-pression of medical images can we use lossy image com-pression in radiologyrdquo Journal-Canadian Association ofRadiologists vol 57 no 4 2006

[17] K Kavinder ldquoDICOM image compression using Huffmancoding technique with vector quantizationrdquo InternationalJournal of Advanced Research in Computer Science vol 4no 3 2013

[18] P Ezhilarasu N Krishnaraj and V S Babu ldquoHuffman codingfor lossless data compression-a reviewrdquo Middle-East Journalof Scientific Research vol 23 no 8 pp 1598ndash1603 2015

[19] A J Maan ldquoAnalysis and comparison of algorithms forlossless data compressionrdquo International Journal of In-formation and Computation Technology vol 3 no 3 2013ISSN 0974-2239

[20] H P Medeiros M C Maciel R Demo Souza andM E Pellenz ldquoLightweight data compression in wirelesssensor networks using Huffman codingrdquo InternationalJournal of Distributed Sensor Networks vol 10 no 1 article672921 2014

[21] T Kumar and D R Kumar ldquoMedical image compressionusing hybrid techniques of DWT DCTand Huffman codingrdquoIJIREEICE vol 3 no 2 pp 54ndash60 2015

10 Journal of Healthcare Engineering

[22] F Fahmi M A Sagala T H Nasution and AnggraenyldquoSequential ndash storage of differences approach in medicalimage data compression for brain image datasetrdquo in In-ternational Seminar on Application of Technology of In-formation and Communication (ISemantic) pp 122ndash125Semarang Indonesia 2016

[23] F Fahmi T H Nasution and A Anggreiny ldquoSmart cloudsystem with image processing server in diagnosing braindiseases dedicated for hospitals with limited resourcesrdquoTechnology and Health Care vol 25 no 3 pp 607ndash610 2017

[24] B O Ayinde and A H Desoky ldquoLossless image compressionusing Zipper transformationrdquo in Proceedings of the In-ternational Conference on Image Processing Computer Vision(IPCVrsquo16) Las Vegas NV USA 2016

[25] S G Miaou F S Ke and S C Chen ldquoJPEGLS a losslesscompression method for medical image sequences usingJPEG-LS and interframe codingrdquo IEEE Transactions on In-formation Technology in Biomedicine vol 13 no 5pp 818ndash821 2009

[26] B Xiao G Lu Y Zhang W Li and GWang ldquoLossless imagecompression based on integer discrete Tchebichef transformrdquoNeurocomputing vol 214 pp 587ndash593 2016

[27] A Martchenko and G Deng ldquoBayesian predictor combina-tion for lossless image compressionrdquo IEEE Transactions onImage Processing vol 22 no 12 pp 5263ndash5270 2013

[28] Y Chen and P Hao ldquoInteger reversible transformation tomake JPEG losslessrdquo in Proceedings of the IEEE InternationalConference on Signal Processing vol 1 pp 835ndash838 OrlandoFL USA May 2004

[29] E Iain Richardson Be H264 Advanced Video CompressionStandard Wiley Hoboken NJ USA 2nd edition 2010

[30] M Ruckert Understanding MP3 Syntax Semantics Mathe-matics and Algorithms Vieweg Verlag Berlin Germany2005

Journal of Healthcare Engineering 11

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 4: Analysis of DICOM Image Compression Alternative …downloads.hindawi.com/journals/jhe/2019/5810540.pdfAnalysis of DICOM Image Compression Alternative Using Huffman Coding Romi Fadillah

window level and width to be configured, resizes the image, and reads and displays the pixel distribution of the image. The View describes the metadata of the DICOM file.

3.3. Huffman Decode. The Huffman decode decompresses the file to restore the original file. The process starts with reading the data flow of the compressed file, creating a prefix tree from the data previously stored into the data flow during the encoding process, and constructing a lookup table that contains codewords and symbols to be changed back into the original byte structure before the compression [29]. Building the Huffman tree bottom-up from probabilities [30] is good in terms of run time; however, for the case of DICOM compression, we found that using a lookup table provides a balance between faster run time and memory usage.

The lookup table contains the codewords and the symbols they represent, generated from a traversal of the previously created prefix tree. The table is used for changing the codeword stream from the data flow back into the original byte distribution. Table 3 shows the lookup table for the compressed DICOM file CT0011.dcm (CT0011.huf).

After reading the codeword stream of the compressed file data flow, each generated codeword is changed back into the original byte or symbol according to the created lookup table. This step is conducted by reading every bit of the data flow and determining whether the accumulated bits are in the lookup table. If not, the next bit is read and appended to the previous bits, and the search in the lookup table is repeated. If the bit string is found as one of the codewords that represents a symbol, the bits are replaced by that symbol. The process continues until the number of returned symbols reaches the original data size.
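The decoding loop above can be sketched in Python (a minimal illustration, not the paper's VB.NET implementation; the three-entry lookup table is hypothetical, in the format of Table 3):

```python
def huffman_decode(bits, lookup, original_size):
    """Decode a stream of '0'/'1' characters using a codeword -> symbol table."""
    out = []
    buffer = ""
    for bit in bits:
        buffer += bit                 # append the next bit to the candidate codeword
        if buffer in lookup:          # prefix codes make the match unambiguous
            out.append(lookup[buffer])
            buffer = ""
            if len(out) == original_size:
                break                 # stop once the original byte count is restored
    return bytes(out)

# Hypothetical lookup table (codeword -> symbol)
table = {"0": 0x00, "10": 0xFF, "11": 0x41}
decoded = huffman_decode("0101011", table, 4)  # -> b'\x00\xff\xffA'
```

Because every codeword is a prefix code, the first match found while threading bits is always the correct one, so no backtracking is needed.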

The programming language used in this research is VB.NET, and the operating system used is Windows 7 Ultimate 64-bit SP1. The computer specification is an Intel i5-450M processor, 4 GB RAM, a 500 GB HDD, and an ATI Radeon HD 5470 graphics card.

The data used in this research work are several sets of DICOM image files available at http://www.mmnt.net, and another set of anonymized DICOM images from computed tomography collected randomly from the University Hospital RS USU, Medan. The files used were 20 DICOM image files with the extension *.dcm. The specifications of all DICOM files are described in Table 4.

4. Results and Discussion

4.1. Huffman Coding Compression Performances. Tables 5 and 6 present the results of DICOM file compression through the Huffman coding technique; the file specifications are provided in Table 4. From the obtained results, the percentage of space savings is up to 72.98% at a 1:3.7010 compression ratio, while the lowest space saving percentage is −0.08%. The worst compression ratio is 1:0.9992.

One of the factors that affect the compression ratio is the number of nodes or symbols from which the prefix tree of the image file is created. This is shown by the CT-MONO2-16-ankle.dcm and CT-MONO2-16-brain.dcm files, which have nearly the same original size (±525 kB) but different compression ratios: the CT-MONO2-16-ankle.dcm file achieved roughly twice the compression ratio of the CT-MONO2-16-brain.dcm file.

Compared to other image data sets in Table 6, the CT0013.dcm file has a smaller compression ratio than the CT0014.dcm file, although the former has fewer symbols. Hence, another factor affects the value of the compression ratio apart from the number of symbols: the BFD value of the file. The BFD values of the CT0013.dcm and CT0014.dcm files are shown in Table 7.

The BFD of the CT0013.dcm file spreads more evenly than that of the CT0014.dcm file. This feature causes the codewords of the former to be longer than those of the latter. For instance, if the assumed codeword lengths are 2 bits for byte 00, 5 bits for byte 03, and 6 bits for byte FE, then the obtained size of the CT0013.dcm file, when other bytes are disregarded, is as follows:

(663477 × 2 + 24806 × 5 + 24489 × 6) = 1597918 bits ≈ 199740 bytes, (1)

while the obtained size of the CT0014.dcm file is as follows:

Table 3: Lookup table for the CT0011.HUF file.

Symbol   Codeword
0        1
1        010000011110
2        010100111001
3        010111
4        00
...
253      011111011100100011010
254      010101
255      011011001001

Table 4: DICOM file specifications used in the experiments.

File name               Number of frames   Size (byte)
CT0011.dcm              8                  4202378
CT0012.dcm              2                  1052902
CT0013.dcm              2                  1052750
CT0014.dcm              2                  1053770
CT0031.dcm              15                 7871216
CT0032.dcm              1                  527992
CT0033.dcm              7                  3675976
CT0034.dcm              1                  528012
CT0035.dcm              6                  3149976
CT0051.dcm              1                  208402
CT0052.dcm              1                  259602
CT0055.dcm              1                  208416
CT0056.dcm              1                  259616
CT0059.dcm              1                  362378
CT0081.dcm              2                  2607730
CT0110.dcm              9                  4725954
CT-MONO2-8-abdo.dcm     1                  262940
CT-MONO2-16-ankle.dcm   1                  525436
CT-MONO2-16-brain.dcm   1                  525968
CT-MONO2-16-chest.dcm   1                  145136

4 Journal of Healthcare Engineering

(698830 × 2 + 14 × 5 + 91 × 6) = 1398276 bits ≈ 174785 bytes. (2)
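The two byte-count estimates above can be reproduced directly from the Table 7 counts (a sketch; only the three example bytes are included, and the byte size is rounded up to whole bytes as in the paper):

```python
def partial_size(counts, lengths):
    """Weighted code length over a subset of bytes: total bits and bytes (rounded up)."""
    bits = sum(counts[b] * lengths[b] for b in counts)
    return bits, -(-bits // 8)  # ceiling division from bits to bytes

lengths = {0x00: 2, 0x03: 5, 0xFE: 6}              # assumed codeword lengths
ct0013 = {0x00: 663477, 0x03: 24806, 0xFE: 24489}  # BFD counts from Table 7
ct0014 = {0x00: 698830, 0x03: 14, 0xFE: 91}

print(partial_size(ct0013, lengths))  # (1597918, 199740)
print(partial_size(ct0014, lengths))  # (1398276, 174785)
```

The evenly spread BFD of CT0013.dcm forces more weight onto the longer codewords, which is why its partial size exceeds that of CT0014.dcm despite similar file sizes.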

In the experimental results, we also observe a limitation of Huffman coding: one case generates a compression ratio of less than 1 (ratio < 1), which causes the generated compressed file to be larger than the original file.

Table 8 shows the BFD pattern and the codewords for the CT-MONO2-16-chest.dcm file, which has a compression ratio of 1:0.9992. The BFD value is nearly evenly distributed and thus causes the created prefix tree to generate codewords which are approximately equal to or longer than the 8 bits required to encode 1 byte. For the

Table 5: Compression results.

File name               Original size (byte)   No. of symbols/nodes   Compression time (s)   Compressed size (byte)
CT0011.dcm              4202378                254                    2.75                   1371932
CT0012.dcm              1052902                223                    0.67                   346880
CT0013.dcm              1052750                209                    0.60                   328241
CT0014.dcm              1053770                254                    0.65                   284727
CT0031.dcm              7871216                256                    17.25                  5057232
CT0032.dcm              527992                 256                    0.65                   322986
CT0033.dcm              3675976                256                    6.50                   2592421
CT0034.dcm              528012                 256                    0.78                   380848
CT0035.dcm              3149976                256                    3.02                   1395825
CT0051.dcm              208402                 256                    0.24                   118713
CT0052.dcm              259602                 256                    0.29                   145620
CT0055.dcm              208416                 256                    0.26                   127395
CT0056.dcm              259616                 256                    0.31                   164952
CT0059.dcm              362378                 256                    0.36                   193416
CT0081.dcm              2607730                256                    4.30                   1882801
CT0110.dcm              4725954                256                    8.87                   3196659
CT-MONO2-8-abdo.dcm     262940                 217                    0.23                   124563
CT-MONO2-16-ankle.dcm   525436                 89                     0.29                   175696
CT-MONO2-16-brain.dcm   525968                 256                    0.61                   360802
CT-MONO2-16-chest.dcm   145136                 256                    0.28                   145248

Table 6: Compression ratio value and space savings.

File name               Compression ratio   Space savings (%)
CT0011.dcm              3.0631              67.35
CT0012.dcm              3.0353              67.05
CT0013.dcm              3.2072              68.82
CT0014.dcm              3.7010              72.98
CT0031.dcm              1.5564              35.75
CT0032.dcm              1.6347              38.83
CT0033.dcm              1.4180              29.48
CT0034.dcm              1.3864              27.87
CT0035.dcm              2.2567              55.69
CT0051.dcm              1.7555              43.04
CT0052.dcm              1.7827              43.91
CT0055.dcm              1.6360              38.87
CT0056.dcm              1.5739              36.46
CT0059.dcm              1.8736              46.63
CT0081.dcm              1.3850              27.80
CT0110.dcm              1.4784              32.36
CT-MONO2-8-abdo.dcm     2.1109              52.63
CT-MONO2-16-ankle.dcm   2.9906              66.56
CT-MONO2-16-brain.dcm   1.4578              31.40
CT-MONO2-16-chest.dcm   0.9992              −0.08

Table 7: BFD for the CT0013.dcm and CT0014.dcm files.

Byte (hexadecimal)   CT0013.dcm   CT0014.dcm
00                   663477       698830
01                   103          175
02                   237          59
03                   24806        14
...
11                   5178         45
12                   4703         18
13                   4719         13
14                   6152         6
...
FC                   3            6
FD                   2            5
FE                   24489       91
FF                   413          682

Table 8: BFD and codewords for the CT-MONO2-16-chest.dcm file.

Byte (hexadecimal)   BFD   Codeword
00                   938   0010110
01                   495   01001000
02                   532   01100111
03                   495   01001001
...
7D                   381   11111110
7E                   472   00110011
7F                   398   00000011
80                   222   001001110
...
FC                   266   100000000
FD                   307   101001010
FE                   210   1011100011
FF                   117   1011100010


CT-MONO2-16-chest.dcm file, the created codewords are 11 codewords of 7 bits, 19 of 9 bits, and 2 of 10 bits; the other codewords are 8 bits long. Of the total number of created codewords, only 11 symbols are compressed into 7 bits, while the others remain 8 bits long or more (9 or 10 bits). Hence, the compression of the CT-MONO2-16-chest.dcm file generates a larger file size than the original size.

Another issue in using Huffman coding occurs when all bytes or symbols have the same occurrence frequency, so the generated prefix tree has a depth of log₂(n), where n is the total number of symbols (n = 256 for each file). The generated codeword for each symbol then has a length equal to the depth of the created prefix tree: log₂(256) = 8 bits. In this case, no compression results from the generated codewords. Figure 2 illustrates this Huffman coding limitation for the same occurrence frequency, assuming that each character only needs 3 bits to be encoded (A = 000, B = 001, C = 010, ..., H = 111).

Table 9 shows that each generated codeword has the same 3-bit length as the initial symbol encoding; therefore, no compression occurs during the process. If the occurrence frequency of each symbol is 5, then the size of the original file (5 × 8 × 3 = 120 bits) will be the same as that of the compressed file (5 × 8 × 3 = 120 bits). This illustration is quite similar to the case of the CT-MONO2-16-chest.dcm file, where the BFD values of the bytes are nearly equal without concentrating on a much greater value. Consequently, the created prefix tree generates codewords of the same length as the initial bit length (8 bits), one fraction with a 7-bit length, and others with 9- and 10-bit lengths. Therefore, the size of the compressed file becomes larger than that of the original file.
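This no-compression case is easy to reproduce: a Huffman build over eight equally frequent symbols yields only 3-bit codewords. A sketch using Python's heapq (an illustration, not the paper's code):

```python
import heapq

def huffman_code_lengths(freqs):
    """Bottom-up Huffman build; returns the codeword length of each symbol."""
    # Heap entries: (frequency, tie-breaker, {symbol: depth-so-far})
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}  # both subtrees drop one level
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

lengths = huffman_code_lengths({c: 5 for c in "ABCDEFGH"})
# Every symbol ends up with a 3-bit codeword, matching log2(8): no gain.
```

With equal frequencies the merges always pair nodes of the same weight, so the tree comes out perfectly balanced and the code degenerates to a fixed-length code.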

4.2. Security Aspect. One of the problems in the security of the DICOM file is that the information of the DICOM file itself can be easily read by using general text editors like Notepad or WordPad. This feature is a serious threat, as even pixel data can be read by taking only the last few bytes of the DICOM file. Figures 3 and 4 show the structure comparison between the original DICOM file and the compressed file. The characters 129 to 132, previously read as "DICM" in the DICOM file, are unreadable in the compressed file. The prefix "1.2.840...", previously readable and indicating a DICOM file, is no longer present in the compressed DICOM file.

All characters that could be directly read previously, such as the patient's name, doctor's name, hospital, and date, change to unique characters in the compressed file. The set of last bytes no longer represents the pixel data of the original DICOM file. Interpreting the compressed DICOM file is now difficult without decompressing it, and the decompression process is known only to the user. The worst thing that anyone can do to the

Figure 2: Prefix tree for symbols with the same appearance frequency (a perfectly balanced binary tree over the symbols A–H).

Table 9: Codewords for symbols with the same occurrence frequency.

Symbol   Codeword
A        111
B        110
C        101
D        100
E        011
F        010
G        001
H        000


compressed file is to change its structure or byte distribution. Such a change may cause the decompression to generate a file with a bit structure different from the original DICOM file; as a result, the generated file becomes unreadable in a DICOM viewer, i.e., corrupted. The result of the decompression process converting the compressed file back into the original DICOM file is shown in Table 10.
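The readability test discussed in this section comes down to the marker defined by the DICOM standard: bytes 0–127 are a preamble, and the next four bytes spell "DICM". A minimal sketch of that check (the Huffman-compressed output fails it because the marker no longer sits at that offset):

```python
def looks_like_dicom(path):
    """True if the file carries the 'DICM' magic after the 128-byte preamble."""
    with open(path, "rb") as f:
        f.seek(128)                # skip the preamble (bytes 0-127)
        return f.read(4) == b"DICM"
```

A compressed .huf file returns False here, which is precisely what makes the compressed stream unreadable to ordinary DICOM tools and text editors.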

4.3. Huffman Coding Compression Time. The duration of compression is proportional to the generated compressed file size: the larger the generated file size, the longer the time required for compression. Figure 5 shows the effect of the generated file size on the duration of compression.

In Huffman coding, the input data are traversed once to compute the BFD and once more to assign a codeword to every symbol. If the input file size is defined as n, then the time needed is 2n. For the prefix tree, if the number of nodes or symbols is defined as |Σ| with a depth of log₂|Σ|, then the prefix tree's size becomes 2|Σ| (the total number of final nodes). From this state, we can obtain a traversed prefix tree for the collection of symbol codewords, whose construction time can be characterized by its best and worst cases.
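The two passes described above, one to collect the BFD and one to emit codewords, can be sketched as follows (a Python stand-in for the paper's VB.NET code; the two-symbol codeword table is hypothetical):

```python
from collections import Counter

def byte_frequency_distribution(data):
    """First pass over the n input bytes: count each byte's occurrences (the BFD)."""
    return Counter(data)

def encode(data, codewords):
    """Second pass over the n input bytes: emit each byte's codeword.
    Together the two passes account for the 2n term in the running time."""
    return "".join(codewords[b] for b in data)

bfd = byte_frequency_distribution(b"\x00\x00\x00\xff")            # Counter({0: 3, 255: 1})
bitstream = encode(b"\x00\x00\x00\xff", {0x00: "0", 0xFF: "10"})  # "00010"
```

Each pass touches every input byte exactly once, so the input-dependent cost is linear; the tree construction cost on top of this is what the best- and worst-case analyses below quantify.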

The time required for decompression depends on the compressed file size, because the search performed during decompression must visit all the bits in the file from the first to the last. Figure 6 shows the effect of the decompressed file size on the duration of decompression. Table 10 shows that a significantly longer time is required for decompression than for compression; for example, the CT0031.dcm file was decompressed in 101.93 seconds but compressed in only 17.25 seconds. However, the compression and decompression times are both proportional to the compressed size and not to the original file size. Figure 6 displays the correlation between the compressed file sizes and the required decompression time.

4.4. Best Case Complexity. The best case in Huffman coding occurs when the constructed prefix tree is a perfectly height-balanced tree, in which the left and right subtrees of every node have similar heights. In a perfectly height-balanced prefix tree, traversing to each symbol takes log₂|Σ|; therefore, collecting the codewords of all symbols takes |Σ| · log₂|Σ|, and if we assume the length of the data is n, the best case complexity becomes 2n + |Σ| · log₂|Σ|, i.e., O(n + |Σ| log₂|Σ|).

4.5. Worst Case Complexity. The worst case in Huffman coding occurs when the constructed prefix tree is degenerate, or linear, which is called an unbalanced tree. In this case, reaching one node takes |Σ| steps; therefore, reaching the codeword of every symbol takes |Σ| × |Σ| = |Σ|². Thus, the worst case time complexity becomes 2n + |Σ|², i.e., O(n + |Σ|²).

4.6. Comparison with JPEG2000. Lossless JPEG2000 compression was applied to the same set of DICOM files listed in Table 4 in order to obtain a benchmark performance comparison with Huffman coding. Figure 7 depicts the results of comparing the compression time and size of JPEG2000 and Huffman coding. Overall, we can observe that Huffman coding compression performance is comparable with the standard JPEG2000 compression method, with slightly faster compression time for CT images. However, JPEG2000 still outperforms Huffman coding in compressing large DICOM images such as CR, DR, and angiography.

Figure 3: CT0011.dcm file hex content.


Nevertheless, Huffman coding maintains the original file format and size, while JPEG2000, not being backward compatible, changes the original size upon decompression. This comparable performance gives Huffman coding an advantage as an alternative implementation of DICOM image compression in open PACS settings, given JPEG2000's proprietary implementations.

5. Conclusions

Huffman coding can generate a compressed DICOM file with a compression ratio of up to 1:3.7010 and space savings of up to 72.98%. The compression ratio and the percentage of space savings are influenced by several factors, such as the number of symbols or initial nodes used to create the prefix tree and the pattern of the BFD spread of the compressed file. The time required for compressing and decompressing is proportional to the compressed file size; that is, the larger the compressed file size, the longer the time required to compress or decompress the file.

Huffman coding has time complexity O(n + |Σ| log₂|Σ|) and space complexity O(|Σ|), where n is the size of the input file read and |Σ| is the number of symbols or initial nodes used to build the prefix tree.

Figure 4: CT0011.HUF file hex content.

Table 10: Decompression results.

File name               Compressed size (byte)   Time (s)   Decompressed size (byte)
CT0011.dcm              1371932                  21.04      4202378
CT0012.dcm              346880                   5.92       1052902
CT0013.dcm              328241                   5.66       1052750
CT0014.dcm              284727                   4.08       1053770
CT0031.dcm              5057232                  101.93     7871216
CT0032.dcm              322986                   6.06       527992
CT0033.dcm              2592421                  48.05      3675976
CT0034.dcm              380848                   7.18       528012
CT0035.dcm              1395825                  23.91      3149976
CT0051.dcm              118713                   2.41       208402
CT0052.dcm              145620                   2.93       259602
CT0055.dcm              127395                   2.62       208416
CT0056.dcm              164952                   3.38       259616
CT0059.dcm              193416                   3.83       362378
CT0081.dcm              1882801                  37.82      2607730
CT0110.dcm              3196659                  65.76      4725954
CT-MONO2-8-abdo.dcm     124563                   2.28       262940
CT-MONO2-16-ankle.dcm   175696                   2.34       525436
CT-MONO2-16-brain.dcm   360802                   7.04       525968
CT-MONO2-16-chest.dcm   145248                   3.01       145136


Figure 6: File size effect on duration of decompression (decompression time in seconds versus decompressed file size in bytes).

Figure 5: File size effect on duration of compression (compression time in seconds versus compressed file size in bytes).

Figure 7: Performance comparison with JPEG2000 (compression time and compressed size for JPEG2000 versus Huffman coding).


The structure of the compressed file cannot be easily interpreted as a DICOM file by using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all information in the original DICOM file.

There is a limitation: the technique stops compressing when every symbol or node of the compressed file has the same occurrence frequency (all symbols have the same BFD value).

Compression of the DICOM image can also be conducted only on the pixel data of the image, without changing the overall file structure, so that the generated compressed file can still be directly read as a DICOM file with a compressed pixel data size. Thus, for future research, encryption of the important information in DICOM files, such as patient ID, name, and date of birth, is considered to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will cover transfer time, compression and decompression, and image reading quality evaluation.

Data Availability

The partial data are obtained from publicly available medical image data at http://www.mmnt.net, while a small set of private data was obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS/2/2014/ICT07/MUSM/03/1.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055–1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390–1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm—a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253–257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80–88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558–566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68–76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Aulí-Llinàs, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?" Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression—a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598–1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54–60, 2015.

[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential – storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122–125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607–610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818–821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587–593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263–5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835–838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics, and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.


(698830 × 2 + 14 × 5 + 91 × 6) = 1398276 bit ≈ 174785 byte.  (2)
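The arithmetic of equation (2) can be reproduced in a few lines: the compressed size is the sum of each byte's frequency multiplied by its codeword length in bits. This sketch uses only the three frequency/length pairs shown in the equation.

```python
# Recompute equation (2): compressed size = sum of (byte frequency x
# codeword length in bits), then convert bits to bytes.
freq_and_codeword_len = [(698830, 2), (14, 5), (91, 6)]

total_bits = sum(freq * length for freq, length in freq_and_codeword_len)
total_bytes = total_bits / 8

print(total_bits)    # 1398276
print(total_bytes)   # 174784.5, i.e. about 174785 bytes
```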

In the experimental results, we also observe a limitation of Huffman coding: one case generates a compression ratio of less than 1 (ratio < 1), meaning the compressed file is larger than the original file.

Table 8 shows the BFD pattern and the codewords for the CT-MONO2-16-chest.dcm file, which has a compression ratio of 0.9992. The BFD values are nearly evenly distributed, which causes the created prefix tree to generate codewords that are approximately equal to or longer than the 8 bits required to represent 1 byte. For the

Table 5: Compression results.

File name | Original size (byte) | No. of symbols/nodes | Compression time (s) | Compressed size (byte)
CT0011.dcm | 4202378 | 254 | 2.75 | 1371932
CT0012.dcm | 1052902 | 223 | 0.67 | 346880
CT0013.dcm | 1052750 | 209 | 0.60 | 328241
CT0014.dcm | 1053770 | 254 | 0.65 | 284727
CT0031.dcm | 7871216 | 256 | 17.25 | 5057232
CT0032.dcm | 527992 | 256 | 0.65 | 322986
CT0033.dcm | 3675976 | 256 | 6.50 | 2592421
CT0034.dcm | 528012 | 256 | 0.78 | 380848
CT0035.dcm | 3149976 | 256 | 3.02 | 1395825
CT0051.dcm | 208402 | 256 | 0.24 | 118713
CT0052.dcm | 259602 | 256 | 0.29 | 145620
CT0055.dcm | 208416 | 256 | 0.26 | 127395
CT0056.dcm | 259616 | 256 | 0.31 | 164952
CT0059.dcm | 362378 | 256 | 0.36 | 193416
CT0081.dcm | 2607730 | 256 | 4.30 | 1882801
CT0110.dcm | 4725954 | 256 | 8.87 | 3196659
CT-MONO2-8-abdo.dcm | 262940 | 217 | 0.23 | 124563
CT-MONO2-16-ankle.dcm | 525436 | 89 | 0.29 | 175696
CT-MONO2-16-brain.dcm | 525968 | 256 | 0.61 | 360802
CT-MONO2-16-chest.dcm | 145136 | 256 | 0.28 | 145248

Table 6: Compression ratio value and space savings.

File name | Compression ratio | Space savings (%)
CT0011.dcm | 3.0631 | 67.35
CT0012.dcm | 3.0353 | 67.05
CT0013.dcm | 3.2072 | 68.82
CT0014.dcm | 3.7010 | 72.98
CT0031.dcm | 1.5564 | 35.75
CT0032.dcm | 1.6347 | 38.83
CT0033.dcm | 1.4180 | 29.48
CT0034.dcm | 1.3864 | 27.87
CT0035.dcm | 2.2567 | 55.69
CT0051.dcm | 1.7555 | 43.04
CT0052.dcm | 1.7827 | 43.91
CT0055.dcm | 1.6360 | 38.87
CT0056.dcm | 1.5739 | 36.46
CT0059.dcm | 1.8736 | 46.63
CT0081.dcm | 1.3850 | 27.80
CT0110.dcm | 1.4784 | 32.36
CT-MONO2-8-abdo.dcm | 2.1109 | 52.63
CT-MONO2-16-ankle.dcm | 2.9906 | 66.56
CT-MONO2-16-brain.dcm | 1.4578 | 31.40
CT-MONO2-16-chest.dcm | 0.9992 | −0.08
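The two metrics in Tables 5 and 6 follow directly from the file sizes: the compression ratio is original size over compressed size, and space savings is the relative size reduction. A minimal sketch (the helper name is ours, not from the paper):

```python
def compression_stats(original_size: int, compressed_size: int):
    """Return (compression ratio, space savings in %) as used in Tables 5-6."""
    ratio = original_size / compressed_size
    savings = (1 - compressed_size / original_size) * 100
    return ratio, savings

# CT0011.dcm from Table 5: 4,202,378 bytes -> 1,371,932 bytes
ratio, savings = compression_stats(4202378, 1371932)
print(round(ratio, 4), round(savings, 2))      # 3.0631 67.35

# CT-MONO2-16-chest.dcm: the compressed file is LARGER, so ratio < 1
# and the space savings are negative.
ratio2, savings2 = compression_stats(145136, 145248)
print(round(ratio2, 4), round(savings2, 2))    # 0.9992 -0.08
```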

Table 7: BFD for the CT0013.dcm and CT0014.dcm files.

Byte (hexadecimal) | CT0013.dcm | CT0014.dcm
00 | 663477 | 698830
01 | 103 | 175
02 | 237 | 59
03 | 24806 | 14
⋮
11 | 5178 | 45
12 | 4703 | 18
13 | 4719 | 13
14 | 6152 | 6
⋮
FC | 3 | 6
FD | 2 | 5
FE | 24489 | 91
FF | 413 | 682
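The BFD in Table 7 is simply a histogram of byte values. A minimal sketch of how such a distribution can be collected (the function name is illustrative, not from the paper):

```python
from collections import Counter

def byte_frequency_distribution(data: bytes) -> Counter:
    # Count how often each of the 256 possible byte values occurs.
    return Counter(data)

sample = b"\x00" * 3 + b"\xfe" + b"\xff"
bfd = byte_frequency_distribution(sample)
print(bfd[0x00], bfd[0xFE], bfd[0xFF])   # 3 1 1
```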

Table 8: BFD and codewords for the CT-MONO2-16-chest.dcm file.

Byte (hexadecimal) | BFD | Codeword
00 | 938 | 0010110
01 | 495 | 01001000
02 | 532 | 01100111
03 | 495 | 01001001
⋮
7D | 381 | 111111110
7E | 472 | 00110011
7F | 398 | 00000011
80 | 222 | 001001110
⋮
FC | 266 | 100000000
FD | 307 | 101001010
FE | 210 | 1011100011
FF | 117 | 1011100010

Journal of Healthcare Engineering 5

CT-MONO2-16-chest.dcm file, the created codewords are 11 codewords of 7-bit length, 19 of 9-bit length, and 2 of 10-bit length; all other codewords are 8 bits long. Of the total number of created codewords, only 11 symbols are compressed into 7 bits, while the others remain at 8 bits or more (9 or 10 bits). Hence, the compression of the CT-MONO2-16-chest.dcm file generates a larger file size than the original.

Another issue in using Huffman coding occurs when all bytes or symbols have the same occurrence frequency: the generated prefix tree then has a depth of log2(n), where n is the total number of symbols (n = 256 for each file). The generated codeword for each symbol has a length equal to the depth of the created prefix tree, log2(256) = 8 bits. In this case, no compression results from the generated codewords. Figure 2 illustrates this Huffman coding limitation for symbols with the same occurrence frequency, assuming that each character needs only 3 bits to be encoded (A = 000, B = 001, C = 010, ..., H = 111).

Table 9 shows that the generated codewords have the same 3-bit length as the initial symbols; therefore, no compression occurred during the process. If the occurrence frequency of each symbol is 5, then the size of the original data (5 ∗ 8 ∗ 3 = 120 bits) is the same as that of the compressed data (5 ∗ 8 ∗ 3 = 120 bits). This illustration is quite similar to the case of the CT-MONO2-16-chest.dcm file, where the BFD values of the bytes are nearly equal, without concentrating on a much greater value. Consequently, the created prefix tree generates codewords of the same length as the initial bit length (8 bits), one fraction with 7-bit length, and others with 9- and 10-bit lengths. Therefore, the size of the compressed file becomes larger than that of the original file.

4.2. Security Aspect. One of the problems in the security of the DICOM file is that the information in the DICOM file itself can easily be read using general text editors like Notepad or WordPad. This is a serious threat, as even pixel data can be read by taking just the last few bytes of the DICOM file. Figures 3 and 4 show the structure comparison between the original DICOM file and the compressed file. The characters 129 to 132, previously read as "DICM" in the DICOM file, are unreadable in the compressed file. The prefix "12480", previously readable as an indicator of a DICOM file, is no longer present in the compressed DICOM file.
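The "DICM" marker mentioned above sits immediately after the 128-byte preamble of a DICOM Part 10 file, which is why a text editor shows it near the start of the file. A sketch of the kind of check this implies (the function name is ours):

```python
def looks_like_dicom(header: bytes) -> bool:
    # DICOM Part 10 layout: a 128-byte preamble, then the 4 ASCII
    # characters "DICM" (byte positions 129-132 as counted in the text).
    return len(header) >= 132 and header[128:132] == b"DICM"

print(looks_like_dicom(b"\x00" * 128 + b"DICM"))  # True
print(looks_like_dicom(b"\x00" * 132))            # False
```

After Huffman compression the bytes at these positions no longer spell "DICM", so this check fails on the compressed file, consistent with the observation above.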

All characters that could previously be read directly, such as the patient's name, doctor's name, hospital, and date, change to unique characters in the compressed file. The set of last bytes no longer represents the pixel data of the original DICOM file. Interpreting the compressed DICOM file is now difficult without decompressing it, and the decompression process is known only to the user. The worst thing that anyone can do to the

Figure 2: Prefix tree for symbols with the same appearance frequency (the root ABCDEFGH splits into ABCD/EFGH, then AB/CD/EF/GH, then the leaves A–H).

Table 9: Codewords for symbols with the same occurrence frequency.

Symbol | Codeword
A | 111
B | 110
C | 101
D | 100
E | 011
F | 010
G | 001
H | 000
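The equal-frequency case of Table 9 is easy to reproduce: building a Huffman code over eight symbols with identical frequencies yields a perfectly balanced tree and eight 3-bit codewords, so no symbol is shortened. This is a minimal sketch, not the authors' implementation:

```python
import heapq
from itertools import count

def huffman_codes(freqs: dict) -> dict:
    """Build a Huffman prefix code from a symbol -> frequency map (a sketch)."""
    tick = count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(tick), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two lowest-frequency subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}   # left branch gets 0
        merged.update({s: "1" + w for s, w in c2.items()})  # right gets 1
        heapq.heappush(heap, (f1 + f2, next(tick), merged))
    return heap[0][2]

# Eight symbols with identical frequency, as in Table 9 / Figure 2:
codes = huffman_codes({s: 5 for s in "ABCDEFGH"})
print(sorted(len(w) for w in codes.values()))  # [3, 3, 3, 3, 3, 3, 3, 3]
```

Every codeword is exactly log2(8) = 3 bits, mirroring the paper's log2(256) = 8-bit argument for full byte alphabets.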


compressed file is to change the structure or byte distribution of the file. This change may cause decompression to generate a file with a bit structure different from that of the original DICOM file. As a result, the generated file becomes unreadable in the DICOM viewer, i.e., corrupted. The result of the decompression process that converts the compressed file back into the original DICOM file is shown in Table 10.

4.3. Huffman Coding Compression Time. The duration of compression is proportional to the generated compressed file size. The larger the generated file size, the longer the time required for compression. Figure 5 shows the effect of the generated file size on the duration of compression.

In Huffman coding, the input data are traversed once to obtain the BFD values and once more to assign a codeword to every symbol. If the input file size is defined as n, then the time needed is 2 ∗ n. For the prefix tree, if the number of nodes and symbols is defined as |Σ| with a depth of log2|Σ|, then the prefix tree's size becomes 2 ∗ |Σ| (the total number of final nodes). From this state, we can obtain a traversed prefix tree for the collection of symbol codewords, which can be characterized by the construction time of the prefix tree in its best and worst cases.

The time required for decompression depends on the compressed file size, because decompression must scan all the bits in the file from the first to the last. Figure 6 shows the effect of file size on the duration of decompression. Table 10 shows that a significantly longer time is required for decompression than for compression. For example, the CT0031.dcm file was decompressed in 101.93 seconds but compressed in only 17.25 seconds. However, the compression and decompression times are both proportional to the compressed size and not to the original file size. Figure 6 displays the correlation between the compressed file sizes and the required decompression time.

4.4. Best Case Complexity. The best case in Huffman coding occurs when the constructed prefix tree is a perfectly height-balanced tree, where the left and right subtrees of each node have similar heights. In a perfectly height-balanced prefix tree, the traversal for each symbol takes log2|Σ|. Therefore, all symbols together take |Σ| ∗ log2|Σ|, and if we assume the length of the data is n, the best case complexity becomes 2 ∗ n + |Σ| ∗ log2|Σ|, or O(n + |Σ| log2|Σ|).

4.5. Worst Case Complexity. The worst case in Huffman coding occurs when the constructed prefix tree forms a degenerated, or linear, tree, which is called an unbalanced tree. In this case, reaching one node takes time |Σ|. Therefore, reaching a codeword for every symbol takes |Σ| ∗ |Σ| = |Σ|², and the time complexity for the worst case becomes 2 ∗ n + |Σ|², or O(n + |Σ|²).
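The degenerated tree of the worst case can be provoked with Fibonacci-like frequencies: every merge chains onto the previous subtree, so the deepest codeword grows to |Σ| − 1 bits. This self-contained sketch (not the authors' code) computes only the codeword lengths:

```python
import heapq

def huffman_code_lengths(freqs: dict) -> dict:
    """Codeword lengths only: repeatedly merge the two smallest weights,
    adding one bit to every symbol inside each merged subtree."""
    heap = [(f, [s]) for s, f in sorted(freqs.items())]
    length = {s: 0 for s in freqs}
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, s1 = heapq.heappop(heap)
        f2, s2 = heapq.heappop(heap)
        for s in s1 + s2:          # every symbol in the merged subtree
            length[s] += 1         # moves one level deeper
        heapq.heappush(heap, (f1 + f2, s1 + s2))
    return length

# Fibonacci weights force the linear (degenerated) tree described above:
lengths = huffman_code_lengths({"a": 1, "b": 1, "c": 2, "d": 3,
                                "e": 5, "f": 8, "g": 13, "h": 21})
print(max(lengths.values()))   # 7, i.e. number of symbols - 1
```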

4.6. Comparison with JPEG2000. Lossless JPEG2000 compression was applied to the same set of DICOM files listed in Table 4 in order to obtain a benchmark performance comparison with Huffman coding. Figure 7 depicts the results of comparing the compression time and size of JPEG2000 and Huffman coding. Overall, we observe that Huffman coding compression performance was comparable with the standard JPEG2000 compression method, with slightly faster compression times for CT images. However, JPEG2000 still outperforms Huffman coding in compressing large DICOM images such as CR, DR, and angiography.

Figure 3: CT0011.dcm file hex content.


Nevertheless, Huffman coding maintains the original file format and size, while JPEG2000, being not backward compatible, changes the original size upon decompression. This comparable performance gives Huffman coding an advantage as an alternative implementation of DICOM image compression in open PACS settings, given JPEG2000's proprietary implementation.

5. Conclusions

Huffman coding can generate a compressed DICOM file with a compression ratio of up to 1 : 3.7010 and space savings of 72.98%. The compression ratio and the percentage of space savings are influenced by several factors, such as the number of symbols or initial nodes used to create the prefix tree and the pattern of the BFD spread of the compressed file. The time required for compressing and decompressing is proportional to the compressed file size; that is, the larger the compressed file size, the longer the time required to compress or decompress the file.

Huffman coding has time complexity O(n + |Σ| log2|Σ|) and space complexity O(|Σ|), where n is the size of the input file read and |Σ| is the number of symbols or initial nodes used to build the prefix tree.

Figure 4: CT0011.HUF file hex content.

Table 10: Decompression results.

File name | Compressed size (byte) | Time (s) | Decompressed size (byte)
CT0011.dcm | 1371932 | 21.04 | 4202378
CT0012.dcm | 346880 | 5.92 | 1052902
CT0013.dcm | 328241 | 5.66 | 1052750
CT0014.dcm | 284727 | 4.08 | 1053770
CT0031.dcm | 5057232 | 101.93 | 7871216
CT0032.dcm | 322986 | 6.06 | 527992
CT0033.dcm | 2592421 | 48.05 | 3675976
CT0034.dcm | 380848 | 7.18 | 528012
CT0035.dcm | 1395825 | 23.91 | 3149976
CT0051.dcm | 118713 | 2.41 | 208402
CT0052.dcm | 145620 | 2.93 | 259602
CT0055.dcm | 127395 | 2.62 | 208416
CT0056.dcm | 164952 | 3.38 | 259616
CT0059.dcm | 193416 | 3.83 | 362378
CT0081.dcm | 1882801 | 37.82 | 2607730
CT0110.dcm | 3196659 | 65.76 | 4725954
CT-MONO2-8-abdo.dcm | 124563 | 2.28 | 262940
CT-MONO2-16-ankle.dcm | 175696 | 2.34 | 525436
CT-MONO2-16-brain.dcm | 360802 | 7.04 | 525968
CT-MONO2-16-chest.dcm | 145248 | 3.01 | 145136


Figure 6: File size effect on duration of decompression (decompressed file size in bytes vs. duration of decompression in seconds).

Figure 5: File size effect on duration of compression (compressed file size in bytes vs. duration of compression in seconds).

Figure 7: Performance comparison with JPEG2000 (compression time in seconds for JPEG2000 vs. Huffman coding across the compressed file sizes).


The structure of the compressed file cannot easily be interpreted as a DICOM file using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all information in the original DICOM file.

There is a limitation: the technique no longer compresses when every symbol or node of the file has the same occurrence frequency (all symbols have the same BFD value).

Compression of the DICOM image can also be conducted on the pixel data alone, without changing the overall file structure, so that the generated file can still be read directly as a DICOM file, with only the pixel data compressed. Thus, for future research, encryption of the important information in DICOM files, such as patient ID, name, and date of birth, is considered in order to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will cover transfer time, compression, and decompression, up to the evaluation of image reading quality.

Data Availability

The partial data are obtained from publicly available medical image data at http://mmnt.net, while a small set of private data was obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS22014ICT07MUSM031.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055–1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390–1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm—a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253–257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using Logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80–88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558–566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68–76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Aulí-Llinàs, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?," Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression—a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598–1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54–60, 2015.

[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential – storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122–125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607–610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818–821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587–593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263–5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835–838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.

Journal of Healthcare Engineering 11

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 6: Analysis of DICOM Image Compression Alternative …downloads.hindawi.com/journals/jhe/2019/5810540.pdfAnalysis of DICOM Image Compression Alternative Using Huffman Coding Romi Fadillah

CT-MONO2-16-chestdcm le the created codewords are 11(7 bit long) 19 (9 bit long) and 2 (10 bit long) e othercodewords are 8 bit long From the total number of createdcodewords only 11 symbols are compressed into 7 bits whilethe others still have the 8 bit length or more (9 bit and 10 bit)Hence the compression of CT-MONO2-16-chestdcm legenerates a larger le size than the original size

Another issue in using Human coding occurs when allbytes or symbols have the same occurrence frequency andthe generated prex tree has a log 2(n) depth where n is thetotal number of symbols (n 256 for each le) e gen-erated codeword for each symbol has a length in the amountof depth from the created prex tree log 2(256) 8 bit Inthis case no compression was generated from the generatedcodeword Figure 2 illustrates the Human coding limita-tion for the same occurrence frequency assuming that eachcharacter only needed 3 bits to be compiled (A 000B 001 C 010 H 111)

Table 9 shows that the generated codeword has the same3 bit length as the initial symbolerefore no compressionoccurred during the process If the value of occurrencefrequency for each symbol is 5 then size of the original le(5 lowast 8 lowast 3120 bit) will be the same as that of the com-pressed le (5 lowast 8 lowast 3120 bit) is illustration is quitesimilar to the case of CT-MONO2-16-chestdcm le wherethe BFD values for the bytes have nearly the same valuewithout centering on a much greater value Consequentlythe created prex tree generates the same length ofcodeword as the initial bit length (8 bit) one fraction with7-bit length and others with 9- and 10-bit lengthserefore the size of the compressed le becomes largerthan that of the original le

42 Security Aspect One of the problems in the security ofthe DICOM le is that the information of the DICOM leitself can be easily read by using general text editors likeNotepad or WordPadis feature is a serious threat as evenpixel data can be read by only with taking a few last bytesfrom the DICOM le Figures 3 and 4 show the structurecomparison between the original DICOM and the com-pressed les e symbol character 129 to 132 previouslyread as ldquoDICMrdquo on the DICOM le is unreadable on thecompressed le e prex ldquo12480 rdquo previously was ableto be read which indicates that the DICOM le is no longeravailable in the compressed DICOM le

All characters that can be directly read previously suchas patientrsquos name doctorrsquos name hospital and datechange to unique characters in a compressed le e set oflast bytes no longer represents the pixel data from theoriginal DICOM le Now interpreting the compressedDICOM le is dicentcult without decompressing the le andthe process for decompression is only known by the usere worst thing that can be done by anyone on the

ABCDEFGH

ABCD EFGH

AB CD EF GH

A B C D E F G H

10

110 0

1 0 1 0 1 0 1 0

Figure 2 Prex tree for symbol with the same appearance frequency

Table 9 Codeword for symbol with the same occurrencefrequency

Symbol CodewordA 111B 110C 101D 100E 011F 010G 001H 000

6 Journal of Healthcare Engineering

compressed file is to change the structure or byte distri-bution from the file)is change may cause decompressionthat generates a file with a different bit structure from theoriginal DICOM file As a result the generated file be-comes unreadable in the DICOM viewer or corrupted )eresult of the decompression process from the compressedfile to be converted into the original DICOM file is shownin Table 10

43 Huffman Coding Compression Time )e duration ofcompression is proportional to the generated compressedfile size )e larger the generated file size the longer the timerequired for compression Figure 5 shows the effect ofduration of compression on the generated file size

In Huffman coding inputted data are traversed whenthey receive a BFD value and when they are assigning acodeword for every symbol If the input file size is defined asn then the time needed will be 2lowast n For the prefix tree if thenodes and symbols are defined as | 1113936 | with the depth oflog 2| 1113936 | then the prefix treersquos size becomes 2lowast| 1113936 | (the totalnumber of final nodes) From this state we can obtain atraversed prefix tree for the collection of symbol codewordswhich can be represented according to the construction timeof the prefix treersquos best and worst cases

)e time required for decompressing depends on thecompressed file size which is caused by the search to de-compress all the bits in the file from the initial to the last bitFigure 6 shows the effect of duration of decompression onthe decompressed file size Table 10 shows that a significantlylonger time is required for the decompression process thanfor the compression time For example the CT0031dcm filewas decompressed for 10193 seconds but compressed foronly 1725 seconds However the compression and de-compression times are both proportional to the compressedsize and not to the original file size Figure 6 displays the

correlation graph between the compressed file sizes with therequired decompression time

44 Best Case Complexity )e best case in Huffman codingoccurs when the constructed prefix tree forms a shape ofperfectly height-balanced tree where subtrees in the left andthe right form one node that has similar height In theperfectly height-balanced prefix tree the traversed time foreach symbol will take log 2| 1113936 | )erefore every symbol willtake | 1113936 |lowast log 2| 1113936 | and if we assume the length of the datais n the best case complexity will become2lowast n + | 1113936 |lowast log 2| 1113936 | or O(n + | 1113936 |lowast log 2| 1113936 |)






compressed file is to change the structure or byte distribution of the file. This change may cause the decompression to generate a file with a bit structure different from that of the original DICOM file. As a result, the generated file becomes unreadable in the DICOM viewer, i.e., corrupted. The result of the decompression process, converting the compressed files back into the original DICOM files, is shown in Table 10.

4.3. Huffman Coding Compression Time. The duration of compression is proportional to the generated compressed file size: the larger the generated file, the longer the compression takes. Figure 5 shows the effect of the generated file size on the duration of compression.

In Huffman coding, the input data are traversed twice: once to compute the BFD values and once to assign a codeword to every symbol. If the input file size is defined as n, the time needed is 2n. For the prefix tree, if the symbols (initial nodes) are defined as |Σ| and the tree depth is log₂|Σ|, then the prefix tree's size becomes 2|Σ| (the total number of final nodes). From this state, the prefix tree is traversed to obtain the collection of symbol codewords, and the construction time of the prefix tree can be characterized by its best and worst cases.
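The two traversals and the prefix-tree construction described above can be sketched as follows. This is a minimal, byte-oriented Python illustration of textbook Huffman coding, written for exposition only; it is not the authors' implementation, and the function names are ours:

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Pass 1: build the byte frequency distribution (BFD) and the prefix tree,
    then walk the tree so that more frequent symbols get shorter codewords."""
    freq = Counter(data)
    if len(freq) == 1:                       # degenerate input: one distinct byte
        return {next(iter(freq)): "0"}
    # Heap entries are (frequency, tie-breaker, subtree); a subtree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)
    while len(heap) > 1:                     # repeatedly merge the two rarest nodes
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tick, (t1, t2)))
        tick += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

def compress(data):
    """Pass 2: re-read the input and emit one codeword per byte."""
    codes = huffman_codes(data)
    return "".join(codes[b] for b in data), codes

def decompress(bits, codes):
    """Greedy prefix matching; valid because no codeword is a prefix of another."""
    rev = {v: k for k, v in codes.items()}
    out, cur = bytearray(), ""
    for bit in bits:
        cur += bit
        if cur in rev:
            out.append(rev[cur])
            cur = ""
    return bytes(out)

# The round trip must reproduce the original bytes exactly; otherwise the
# decompressed DICOM file would be corrupted.
sample = b"DICM" + bytes(range(16)) * 8
bits, codes = compress(sample)
assert decompress(bits, codes) == sample
```

The two passes over the input correspond to the 2n term in the complexity analysis below, and the tree walk yields the codeword table in time proportional to the tree size.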

The time required for decompression depends on the compressed file size, since every bit in the file, from the first to the last, must be scanned during decompression. Table 10 shows that the decompression process takes significantly longer than compression: for example, the CT0031.dcm file was decompressed in 101.93 seconds but compressed in only 17.25 seconds. Both the compression and decompression times are proportional to the compressed size, not to the original file size. Figure 6 displays the correlation between the file size and the required decompression time.

4.4. Best Case Complexity. The best case in Huffman coding occurs when the constructed prefix tree is a perfectly height-balanced tree, in which the left and right subtrees of every node have similar height. In a perfectly height-balanced prefix tree, traversing to each symbol takes log₂|Σ| steps, so processing all symbols takes |Σ| · log₂|Σ|. If the length of the data is n, the best-case running time becomes 2n + |Σ| · log₂|Σ|, i.e., O(n + |Σ| log₂|Σ|).

4.5. Worst Case Complexity. The worst case in Huffman coding occurs when the constructed prefix tree degenerates into a linear, unbalanced tree. In this case, reaching a single node takes up to |Σ| steps, so obtaining the codewords for all symbols takes |Σ| · |Σ| = |Σ|². The worst-case running time therefore becomes 2n + |Σ|², i.e., O(n + |Σ|²).
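This worst case is easy to reproduce: when each symbol's frequency exceeds the sum of all smaller frequencies (e.g., powers of two), every merge chains onto the previously merged node and the tree degenerates into a line. A small sketch, using a `code_lengths` helper of our own for illustration:

```python
import heapq

def code_lengths(freqs):
    """Return the Huffman codeword length of each symbol, tracking, for every
    merge, which symbols sit under the merged node (each gains one bit)."""
    heap = [(f, [i]) for i, f in enumerate(freqs)]
    lengths = [0] * len(freqs)
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, s1 = heapq.heappop(heap)
        f2, s2 = heapq.heappop(heap)
        for i in s1 + s2:            # symbols under the merged node gain one bit
            lengths[i] += 1
        heapq.heappush(heap, (f1 + f2, s1 + s2))
    return lengths

# Powers of two: each frequency exceeds the sum of all smaller ones, so every
# merge chains onto the previous node and the deepest codeword has |Σ| - 1 bits.
skewed = [2 ** i for i in range(8)]
assert max(code_lengths(skewed)) == len(skewed) - 1
```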

4.6. Comparison with JPEG2000. Lossless JPEG2000 compression was applied to the same set of DICOM files listed in Table 4 in order to obtain a benchmark performance comparison with Huffman coding. Figure 7 depicts the comparison of compression time and compressed size between JPEG2000 and Huffman coding. Overall, the results show that the compression performance of Huffman coding was comparable with the standard JPEG2000 compression method, with slightly faster compression times for CT images. However, JPEG2000 still outperforms Huffman coding when compressing large DICOM images such as CR, DR, and angiography.

Figure 3: CT0011.dcm file hex content.

Journal of Healthcare Engineering 7

Nevertheless, Huffman coding maintains the original file format and size, while JPEG2000, which is not backward compatible, changes the original size upon decompression. This comparable performance gives Huffman coding an advantage as an alternative implementation of DICOM image compression in open PACS settings, given the proprietary nature of JPEG2000 implementations.

5. Conclusions

Huffman coding can generate a compressed DICOM file with a compression ratio of up to 1 : 3.7010 and space savings of up to 72.98%. The compression ratio and the percentage of space savings are influenced by several factors, such as the number of symbols (initial nodes) used to create the prefix tree and the pattern of the BFD spread of the compressed file. The time required for compression and decompression is proportional to the compressed file size: the larger the compressed file, the longer the time required to compress or decompress it.
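As a quick sanity check (ours, not in the paper), the two headline figures are mutually consistent: a compression ratio of 1 : r implies space savings of 1 − 1/r, and for r = 3.7010 this gives approximately 72.98%.

```python
# Space savings implied by a compression ratio of 1 : r is 1 - 1/r.
ratio = 3.7010
savings_percent = (1 - 1 / ratio) * 100
assert round(savings_percent, 2) == 72.98
```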

Huffman coding has time complexity O(n + |Σ| log₂|Σ|) and space complexity O(|Σ|), where n is the size of the input file read and |Σ| is the number of symbols (initial nodes) used to build the prefix tree.

Figure 4: CT0011.HUF file hex content.

Table 10: Decompression results.

File name               Compressed size (byte)  Time (s)  Decompressed size (byte)
CT0011.dcm              1371932                 21.04     4202378
CT0012.dcm              346880                  5.92      1052902
CT0013.dcm              328241                  5.66      1052750
CT0014.dcm              284727                  4.08      1053770
CT0031.dcm              5057232                 101.93    7871216
CT0032.dcm              322986                  6.06      527992
CT0033.dcm              2592421                 48.05     3675976
CT0034.dcm              380848                  7.18      528012
CT0035.dcm              1395825                 23.91     3149976
CT0051.dcm              118713                  2.41      208402
CT0052.dcm              145620                  2.93      259602
CT0055.dcm              127395                  2.62      208416
CT0056.dcm              164952                  3.38      259616
CT0059.dcm              193416                  3.83      362378
CT0081.dcm              1882801                 37.82     2607730
CT0110.dcm              3196659                 65.76     4725954
CT-MONO2-8-abdo.dcm     124563                  2.28      262940
CT-MONO2-16-ankle.dcm   175696                  2.34      525436
CT-MONO2-16-brain.dcm   360802                  7.04      525968
CT-MONO2-16-chest.dcm   145248                  3.01      145136


Figure 6: File size effect on duration of decompression (decompressed file size in bytes vs. duration of decompression in seconds).

Figure 5: File size effect on duration of compression (compressed file size in bytes vs. duration of compression in seconds).

Figure 7: Performance comparison with JPEG2000 (compression time in seconds and compressed size in bytes, for JPEG2000 vs. Huffman coding).


The structure of the compressed file cannot be easily interpreted as a DICOM file using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all information in the original DICOM file.

There is a limitation: the technique stops compressing when each symbol (node) in the file to be compressed has the same occurrence frequency, i.e., all symbols have the same BFD value.
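This limitation can be illustrated directly: with a uniform BFD over all 256 byte values, the prefix tree is perfectly balanced and every codeword is 8 bits long, so the encoded stream is no shorter than the raw bytes. The `code_lengths` helper below is ours, for illustration only:

```python
import heapq

def code_lengths(freqs):
    """Huffman codeword length per symbol (illustrative helper)."""
    heap = [(f, [i]) for i, f in enumerate(freqs)]
    lengths = [0] * len(freqs)
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, s1 = heapq.heappop(heap)
        f2, s2 = heapq.heappop(heap)
        for i in s1 + s2:            # symbols under the merged node gain one bit
            lengths[i] += 1
        heapq.heappush(heap, (f1 + f2, s1 + s2))
    return lengths

# A uniform BFD over all 256 byte values yields a perfectly balanced tree:
# every codeword is 8 bits, the same as the uncompressed representation.
assert set(code_lengths([10] * 256)) == {8}
```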

Compression of the DICOM image can also be conducted on the pixel data alone, without changing the overall file structure, so that the generated file can still be read directly as a DICOM file with a compressed pixel data size. Thus, for future research, encrypting the important information in DICOM files, such as patient ID, name, and date of birth, is considered in order to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will focus on transfer time, compression and decompression, and evaluation of image reading quality.

Data Availability

The partial data are obtained from publicly available medical image data from http://mmnt.net, while a small set of private data are obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS/2/2014/ICT07/MUSM/03/1.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055–1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390–1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm—a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253–257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80–88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558–566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68–76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Aulí-Llinàs, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?," Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression—a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598–1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54–60, 2015.


[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential – storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122–125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607–610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818–821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587–593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263–5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835–838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics, and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.




[19] A J Maan ldquoAnalysis and comparison of algorithms forlossless data compressionrdquo International Journal of In-formation and Computation Technology vol 3 no 3 2013ISSN 0974-2239

[20] H P Medeiros M C Maciel R Demo Souza andM E Pellenz ldquoLightweight data compression in wirelesssensor networks using Huffman codingrdquo InternationalJournal of Distributed Sensor Networks vol 10 no 1 article672921 2014

[21] T Kumar and D R Kumar ldquoMedical image compressionusing hybrid techniques of DWT DCTand Huffman codingrdquoIJIREEICE vol 3 no 2 pp 54ndash60 2015

10 Journal of Healthcare Engineering

[22] F Fahmi M A Sagala T H Nasution and AnggraenyldquoSequential ndash storage of differences approach in medicalimage data compression for brain image datasetrdquo in In-ternational Seminar on Application of Technology of In-formation and Communication (ISemantic) pp 122ndash125Semarang Indonesia 2016

[23] F Fahmi T H Nasution and A Anggreiny ldquoSmart cloudsystem with image processing server in diagnosing braindiseases dedicated for hospitals with limited resourcesrdquoTechnology and Health Care vol 25 no 3 pp 607ndash610 2017

[24] B O Ayinde and A H Desoky ldquoLossless image compressionusing Zipper transformationrdquo in Proceedings of the In-ternational Conference on Image Processing Computer Vision(IPCVrsquo16) Las Vegas NV USA 2016

[25] S G Miaou F S Ke and S C Chen ldquoJPEGLS a losslesscompression method for medical image sequences usingJPEG-LS and interframe codingrdquo IEEE Transactions on In-formation Technology in Biomedicine vol 13 no 5pp 818ndash821 2009

[26] B Xiao G Lu Y Zhang W Li and GWang ldquoLossless imagecompression based on integer discrete Tchebichef transformrdquoNeurocomputing vol 214 pp 587ndash593 2016

[27] A Martchenko and G Deng ldquoBayesian predictor combina-tion for lossless image compressionrdquo IEEE Transactions onImage Processing vol 22 no 12 pp 5263ndash5270 2013

[28] Y Chen and P Hao ldquoInteger reversible transformation tomake JPEG losslessrdquo in Proceedings of the IEEE InternationalConference on Signal Processing vol 1 pp 835ndash838 OrlandoFL USA May 2004

[29] E Iain Richardson Be H264 Advanced Video CompressionStandard Wiley Hoboken NJ USA 2nd edition 2010

[30] M Ruckert Understanding MP3 Syntax Semantics Mathe-matics and Algorithms Vieweg Verlag Berlin Germany2005

Journal of Healthcare Engineering 11

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Active and Passive Electronic Components

VLSI Design

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Shock and Vibration

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Acoustics and VibrationAdvances in

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

SensorsJournal of

Hindawiwwwhindawicom Volume 2018

International Journal of

RotatingMachinery

Hindawiwwwhindawicom Volume 2018

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Chemical EngineeringInternational Journal of Antennas and

Propagation

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Navigation and Observation

International Journal of

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom

Page 9: Analysis of DICOM Image Compression Alternative …downloads.hindawi.com/journals/jhe/2019/5810540.pdfAnalysis of DICOM Image Compression Alternative Using Huffman Coding Romi Fadillah

Figure 5: File size effect on duration of compression.

Figure 6: File size effect on duration of decompression.

Figure 7: Performance comparison with JPEG2000.

The structure of the compressed file cannot be easily interpreted as a DICOM file using a general text editor such as Notepad or WordPad. In addition, Huffman coding is able to compress and decompress while preserving all information in the original DICOM file.

There is one limitation: the technique stops compressing once every symbol (node) in the file has the same occurrence frequency, that is, when all symbols have the same byte frequency distribution (BFD) value.
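To make this limitation concrete, the following minimal sketch (an illustrative Huffman coder, not the implementation used in this study) builds a codeword table from the byte frequency distribution and counts the output bits. With a skewed BFD the dominant byte receives a short codeword and the data shrink; with a perfectly uniform BFD over all 256 byte values every codeword comes out exactly 8 bits long, so no space is saved:

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(data: bytes) -> dict:
    """Build a Huffman codeword table from the byte frequency distribution (BFD)."""
    freq = Counter(data)
    tie = count()  # tie-breaker so the heap never has to compare the dict payloads
    heap = [(f, next(tie), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate input: a single distinct byte value
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # the two least frequent subtrees...
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}  # ...are merged, extending
        merged.update({s: "1" + c for s, c in t2.items()})  # their codewords by one bit
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def compressed_bits(data: bytes) -> int:
    """Total number of bits after Huffman coding (code-table overhead ignored)."""
    table = huffman_code(data)
    return sum(len(table[b]) for b in data)

# Skewed BFD: the dominant byte gets a 1-bit codeword, so the data shrink.
skewed = bytes([0] * 90 + [1] * 5 + [2] * 3 + [3] * 2)
# Uniform BFD: all 256 byte values equally frequent, so every codeword is
# 8 bits and the "compressed" size equals the original size.
uniform = bytes(range(256))
```

Here `compressed_bits(skewed)` is well below `len(skewed) * 8`, whereas `compressed_bits(uniform)` equals `len(uniform) * 8` exactly; the latter is the equal-frequency case in which the method stops paying off.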

Compression of the DICOM image can also be performed on the pixel data alone, without changing the overall file structure, so that the generated file can still be read directly as a DICOM file, only with a smaller pixel data size. Thus, for future research, encryption of the sensitive information in DICOM files, such as the patient ID, name, and date of birth, is considered to strengthen the security of the data.

Lastly, we also plan to further evaluate the Huffman coding implementation for inclusion in the popular dcm4chee open PACS implementation. Such a study will cover transfer time, compression and decompression, and image reading quality evaluation.

Data Availability

The partial data were obtained from publicly available medical image data at http://mmnt.net, while a small set of private data was obtained from a local hospital after anonymizing the patient information details.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported in part by the FRGS project funded by the Malaysian Ministry of Education under Grant no. FRGS/2/2014/ICT07/MUSM/03/1.

References

[1] O. S. Pianykh, Digital Imaging and Communications in Medicine (DICOM): A Practical Introduction and Survival Guide, Springer-Verlag, Berlin, Germany, 2nd edition, 2012.

[2] T. C. Piliouras, R. J. Suss, and P. L. Yu, "Digital imaging & electronic health record systems: implementation and regulatory challenges faced by healthcare providers," in Proceedings of the Long Island Systems, Applications and Technology Conference (LISAT 2015), pp. 1–6, IEEE, Farmingdale, NY, USA, May 2015.

[3] P. M. A. van Ooijen, K. Y. Aryanto, A. Broekema, and S. Horii, "DICOM data migration for PACS transition: procedure and pitfalls," International Journal of Computer Assisted Radiology and Surgery, vol. 10, no. 7, pp. 1055–1064, 2015.

[4] Y. Yuan, L. Yan, Y. Wang, G. Hu, and M. Chen, "Sharing of larger medical DICOM imaging data-sets in cloud computing," Journal of Medical Imaging and Health Informatics, vol. 5, no. 7, pp. 1390–1394, 2015.

[5] D. Salomon and G. Motta, Handbook of Data Compression, Springer Science & Business Media, Berlin, Germany, 2010.

[6] P. M. Parekar and S. S. Thakare, "Lossless data compression algorithm – a review," International Journal of Computer Science & Information Technologies, vol. 5, no. 1, 2014.

[7] F. Garcia-Vilchez, J. Munoz-Mari, M. Zortea et al., "On the impact of lossy compression on hyperspectral image classification and unmixing," IEEE Geoscience and Remote Sensing Letters, vol. 8, no. 2, pp. 253–257, 2011.

[8] V. Kumar, S. Barthwal, R. Kishore, R. Saklani, A. Sharma, and S. Sharma, "Lossy data compression using logarithm," 2016, http://arxiv.org/abs/1604.02035.

[9] M. Fatehi, R. Safdari, M. Ghazisaeidi, M. Jebraeily, and M. Habibikoolaee, "Data standards in tele-radiology," Acta Informatica Medica, vol. 23, no. 3, p. 165, 2015.

[10] T. Richter, A. Artusi, and T. Ebrahimi, "JPEG XT: a new family of JPEG backward-compatible standards," IEEE MultiMedia, vol. 23, no. 3, pp. 80–88, 2016.

[11] D. Haak, C.-E. Page, S. Reinartz, T. Kruger, and T. M. Deserno, "DICOM for clinical research: PACS-integrated electronic data capture in multi-center trials," Journal of Digital Imaging, vol. 28, no. 5, pp. 558–566, 2015.

[12] F. Liu, M. Hernandez-Cabronero, V. Sanchez, M. Marcellin, and A. Bilgin, "The current role of image compression standards in medical imaging," Information, vol. 8, no. 4, p. 131, 2017.

[13] S. Priya, "A novel approach for secured transmission of DICOM images," International Journal of Advanced Intelligence Paradigms, vol. 12, no. 1-2, pp. 68–76, 2019.

[14] J. Bartrina-Rapesta, V. Sanchez, J. Serra-Sagrista, M. W. Marcellin, F. Auli-Llinas, and I. Blanes, "Lossless medical image compression through lightweight binary arithmetic coding," in Proceedings of the Applications of Digital Image Processing XL, SPIE Optical Engineering + Applications, vol. 10396, article 103960S, International Society for Optics and Photonics, San Diego, CA, USA, 2017.

[15] S. S. Parikh, D. Ruiz, H. Kalva, G. Fernandez-Escribano, and V. Adzic, "High bit-depth medical image compression with HEVC," IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 2, pp. 552–560, 2018.

[16] D. A. Koff and H. Shulman, "An overview of digital compression of medical images: can we use lossy image compression in radiology?," Journal-Canadian Association of Radiologists, vol. 57, no. 4, 2006.

[17] K. Kavinder, "DICOM image compression using Huffman coding technique with vector quantization," International Journal of Advanced Research in Computer Science, vol. 4, no. 3, 2013.

[18] P. Ezhilarasu, N. Krishnaraj, and V. S. Babu, "Huffman coding for lossless data compression – a review," Middle-East Journal of Scientific Research, vol. 23, no. 8, pp. 1598–1603, 2015.

[19] A. J. Maan, "Analysis and comparison of algorithms for lossless data compression," International Journal of Information and Computation Technology, vol. 3, no. 3, 2013, ISSN 0974-2239.

[20] H. P. Medeiros, M. C. Maciel, R. Demo Souza, and M. E. Pellenz, "Lightweight data compression in wireless sensor networks using Huffman coding," International Journal of Distributed Sensor Networks, vol. 10, no. 1, article 672921, 2014.

[21] T. Kumar and D. R. Kumar, "Medical image compression using hybrid techniques of DWT, DCT and Huffman coding," IJIREEICE, vol. 3, no. 2, pp. 54–60, 2015.

[22] F. Fahmi, M. A. Sagala, T. H. Nasution, and Anggraeny, "Sequential – storage of differences approach in medical image data compression for brain image dataset," in International Seminar on Application of Technology of Information and Communication (ISemantic), pp. 122–125, Semarang, Indonesia, 2016.

[23] F. Fahmi, T. H. Nasution, and A. Anggreiny, "Smart cloud system with image processing server in diagnosing brain diseases dedicated for hospitals with limited resources," Technology and Health Care, vol. 25, no. 3, pp. 607–610, 2017.

[24] B. O. Ayinde and A. H. Desoky, "Lossless image compression using Zipper transformation," in Proceedings of the International Conference on Image Processing, Computer Vision (IPCV'16), Las Vegas, NV, USA, 2016.

[25] S. G. Miaou, F. S. Ke, and S. C. Chen, "A lossless compression method for medical image sequences using JPEG-LS and interframe coding," IEEE Transactions on Information Technology in Biomedicine, vol. 13, no. 5, pp. 818–821, 2009.

[26] B. Xiao, G. Lu, Y. Zhang, W. Li, and G. Wang, "Lossless image compression based on integer discrete Tchebichef transform," Neurocomputing, vol. 214, pp. 587–593, 2016.

[27] A. Martchenko and G. Deng, "Bayesian predictor combination for lossless image compression," IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 5263–5270, 2013.

[28] Y. Chen and P. Hao, "Integer reversible transformation to make JPEG lossless," in Proceedings of the IEEE International Conference on Signal Processing, vol. 1, pp. 835–838, Orlando, FL, USA, May 2004.

[29] I. E. Richardson, The H.264 Advanced Video Compression Standard, Wiley, Hoboken, NJ, USA, 2nd edition, 2010.

[30] M. Ruckert, Understanding MP3: Syntax, Semantics, Mathematics, and Algorithms, Vieweg Verlag, Berlin, Germany, 2005.
