icip2004


    AUTOMATIC GENERATION OF C++/JAVA CODE FOR BINARY ARITHMETIC CODING

    Danny Hong and Alexandros Eleftheriadis

Columbia University, Dept. of Electrical Engineering

    New York, NY 10027, USA

    ABSTRACT

Binary arithmetic coding is, compression-wise, the most effective statistical coding method used in image and video compression. It is being used for compressing bi-level images (JBIG, JBIG2, and MPEG-4 shape coding) and is also utilized (optionally) for coding continuous-tone images (JPEG) and video (H.264). Despite its wide use, different arithmetic coders are incompatible with each other, and application developers are faced with the difficult task of understanding and building each coder. We present a set of simple parameters that can be used to describe any binary arithmetic coder that is currently being deployed, and we also introduce a software tool for automatically generating C++/Java code for binary arithmetic coding according to the description.

    1. INTRODUCTION

Huffman coding [1] is arguably the most widely used statistical compression mechanism for media representation (e.g., Group 3 [2] and Group 4 [3] fax, MPEG-1 [4], MPEG-2 [5], etc.). It is proven to be optimal among instantaneous (prefix) codes, as it can represent any given random variable within 1 bit of its entropy. Arithmetic coding [6, 7], derived from Elias coding [8], is another statistical coding method, proven to yield better compression than Huffman coding; however, it has not been widely used for media coding due to its complexity and patent issues. The first practical arithmetic coders were developed for compressing bi-level images (the Skew coder [9, 10] and the Q-Coder [11]), as Huffman coding cannot compress binary symbols unless groups of symbols are coded at a time. Run-length coding (e.g., Golomb coding [12]) is a good alternative coding method for binary symbols when the probability of one symbol is much higher than that of the other. Nevertheless, it is a static coding method; to obtain good compression for all possible binary source sequences, an adaptive binary arithmetic coder (BAC) is the better choice.

Even today, most practical arithmetic coders deal solely with binary alphabets: binary arithmetic coding is computationally simple, and it makes the use of higher-order conditioning models feasible. In some cases it is only natural to assume a binary source. For instance, JBIG [13], JBIG2 [14], and MPEG-4 shape coding [15] focus on coding of bi-level images (JBIG and JBIG2 can also be used for grayscale images, where bit-plane by bit-plane coding is applied), and for JPEG2000 [16] and MPEG-4 texture coding [15], bit-plane coding is ultimately applied. On the other hand, arithmetic coding can optionally be used in JPEG [17] and H.264 [18]; in these cases, each syntactic element is first binarized so that a BAC can be used. Despite such wide use of BACs, different BACs are generally incompatible with each other (e.g., the code string generated by an arithmetic encoder specified in JBIG cannot be correctly decoded by an arithmetic decoder specified for MPEG-4 shape coding). As a remedy, we present a unique solution that unifies binary arithmetic coders (BACs): we define a set of parameters that can be used to automatically generate different variants of BACs.

(This material is based upon work supported in part by the National Science Foundation under Grant ACI-0313116.)

Arithmetic coding can be separated into two main parts: modeling and coding. The modeling part appropriately selects one or more structures for conditioning events, and gathers the relative frequencies of the conditioned events [9, 19], which correspond to the event probabilities. Modeling is, by itself, a huge topic, and numerous effective models have been introduced; the H.264 standard alone defines more than 300 models to account for the different structures each bit of the binarized syntactic elements might have. Consequently, unifying modeling is an extremely difficult task (if not an impossible one), and we focus only on the coding part.

Flavor [20, 21] is a language that has been developed to describe the syntax of any compressed bitstream, so that the bitstream parsing and generation code can be automatically generated. Flavor already has constructs for describing variable-length codes, and we complement it by introducing a set of new constructs for describing binary arithmetic codes. Using Flavor with the new constructs, the coding part of any BAC can be easily described and the corresponding C++/Java code can be automatically generated. As a result, application developers can concentrate solely on the modeling part, which has been shown to have a much higher impact on the compression effectiveness of a BAC than the coding part.

The next section briefly describes the main concept behind binary arithmetic coding, Sections 3 and 4 present the constructs needed to describe practical BACs, and we conclude with Section 5.

    2. BACKGROUND

High-level pseudo-code describing the basic concept of binary arithmetic coding is depicted in Figure 1. The variable R represents the current interval (initially 1), and the interval is divided into two subintervals (R0 and R1) according to the probabilities of the two possible symbols (P0 and P1 = 1-P0). The variable L represents the lower bound of the current interval (the interval is represented as [L, L+R)). If the symbol 0 is being coded then, assuming that R0 always lies above R1, the new interval is [L+R1, L+R1+R0); likewise, for symbol 1, the new interval is [L, L+R1). For each symbol coded, the current interval gets subdivided, and at the end, the minimum number of bits that can uniquely represent the final interval is output as the code string. For decoding, the variable V represents the code string, and the decoder essentially mimics the encoding process to deduce the original symbols. X represents the current symbol being encoded/decoded.


1) R0 = R*P0
2) R1 = R-R0
3) if (X == 0) R = R0, L = L+R1 else R = R1

(a) Encoding

1) R0 = R*P0
2) R1 = R-R0
3) if (V-L >= R1) R = R0, L = L+R1, X = 0 else R = R1, X = 1

(b) Decoding

Fig. 1. Binary Elias coding.

The current interval of an arithmetic coder is referred to as the state (or internal state) of the coder, and there are many ways to represent it. As in the above example, we can use L and R. Alternatively, letting H be the upper (higher) bound of the current interval, the interval can be represented by [L, H). There are also other representations, but as long as the intervals do not overlap, the decoder can yield the correct symbols. Additionally, every unit of the current interval should be assigned to one of the subintervals to maximize compression. The Flavor-generated code uses the [L, L+R) interval convention.

    3. INTEGER ARITHMETIC CODING

To overcome the precision problem inherent in Elias coding, most practical arithmetic coders are implemented using integer arithmetic with renormalization [7, 11]. Though it is possible to use floating-point numbers, integer arithmetic is preferred for its simplicity and better portability. As a consequence of using integer arithmetic, the probabilities of the symbols are represented by their respective counts (C0 and C1), and the corresponding integer binary arithmetic coding process is shown in Figure 2. In the following, we describe the set of parameters that can be set to describe any integer BAC.

1) R0 = R * C0 / (C0+C1)
2) R1 = R-R0
3) if (X == 0) R = R0, L = L+R1 else R = R1
4) renormalize

(a) Encoding

1) R0 = R * C0 / (C0+C1)
2) R1 = R-R0
3) if (V-L >= R1) R = R0, L = L+R1, X = 0 else R = R1, X = 1
4) renormalize

(b) Decoding

Fig. 2. Binary arithmetic coding.

1) Precision (B). B is the number of bits used to represent the current interval. Using the specified value for B, Flavor defines two constants, HALF=1<<(B-1) and QTR=1<<(B-2).


For decoding, an extra variable V is maintained, which always contains the first B un-decoded bits of the code string; in certain cases, however, it may initially contain fewer than B bits (e.g., MPEG-4 shape coding). Alternatively, instead of maintaining (L, R, V) in the decoder, just (R, D) can be maintained, where D=V-L. Using D instead of V and L makes the decoder simpler [22]; however, fewer options are then available for terminating the coder (see the next paragraph). In Flavor, the syntax init(R=?,V=?) or init(R=?,D=?) can be used to determine how to initialize the coder. For example, using init(R=HALF,D=B) generates initialization code compatible with the TOIS implementation [22].

8) Disambiguating the last symbol. When terminating, some information that is still left in the coder must be output to enable the decoder to unambiguously decode the last symbol. We call this disambiguating the last symbol, and it is different from the actual termination problem, which is notifying the decoder when to stop. If, after encoding the last symbol, R>=2

For example, the M coder used by an H.264 video coder allows R to take U=4 different values. Then the values of R_LPS for, not one but, four different values of R are pre-specified. As a result, the table is bigger than the one used for the Q-Coder, but the M coder yields better compression. In the M coder, the RTable is accessed by two indices, i and r. As in the Q-Coder, the index i determines the current probability distribution, and the index r in [0, U) determines the current range. To maximize the coding speed of the M coder, U is restricted to be a power of 2, i.e., U=1<<q.


we can specify the transition to take place whenever renormalization takes place. The conditional exchange defined by the Switch table also takes place according to the Trans setting.

FAC{
  B=10, SOC=LM, R>=QTR, CO=FO,
  init(R=HALF-2, D=B-1),
  RTable(LPS)=MTable(64, 4),
  Next(LPS)=NextLPS(64),
  Next(MPS)=NextMPS(64)
}

Fig. 5. Flavor description of the H.264 arithmetic coder.

The quasi-arithmetic coder [27] is perhaps the fastest BAC, as it replaces all arithmetic operations with table lookups; all calculations are done in advance. It takes the idea of the M coder and the Q-Coder a step further: in addition to the RTable and the transition tables, four additional tables are specified. Two tables, which can be specified using the Out keyword, are needed to specify the output. For example, the Out(LPS)=OutLPS(64,4) construct defines the OutLPS table, where each entry of the table indicates the output for the corresponding entry in RTable (with 64x4 entries) when the symbol to be coded is an LPS; likewise, Out(MPS) can be used for the MPS. Two additional tables, specified using the NextR keyword, are used for determining the next r index (just as the transition tables defined using the Next keyword are used to determine the next i index).

    5. CONCLUSION

We have described a set of new constructs for Flavor (based on features common to the BACs that are currently deployed), which can be used to describe any BAC. Then, using the Flavor translator, C++ or Java code can be automatically generated for the described arithmetic coder.

    6. REFERENCES

[1] D. A. Huffman, "A Method for the Construction of Minimum Redundancy Codes," Proceedings of the IRE, vol. 40, pp. 1098-1101, 1952.

[2] CCITT (ITU Recommendation T.4), Standardization of Group 3 Facsimile Apparatus for Document Transmission, 1980, amended in 1984 and 1988.

[3] CCITT (ITU Recommendation T.6), Facsimile Coding Schemes and Coding Control Functions for Group 4 Facsimile Apparatus, 1984, amended in 1988.

[4] ISO/IEC 11172 International Standard (MPEG-1), Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s, 1993.

[5] ISO/IEC 13818 International Standard (MPEG-2), Information technology - Generic coding of moving pictures and associated audio information, 1996.

[6] G. G. Langdon, "An Introduction to Arithmetic Coding," IBM J. Res. Develop., vol. 28, pp. 135-149, 1984.

[7] I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic Coding for Data Compression," Communications of the ACM, vol. 30, pp. 520-540, 1987.

[8] F. Jelinek, Probabilistic Information Theory, McGraw-Hill, 1968.

[9] G. G. Langdon and J. Rissanen, "Compression of Black-White Images with Arithmetic Coding," IEEE Trans. on Communications, vol. 29, pp. 858-867, 1981.

[10] G. G. Langdon and J. Rissanen, "A Simple General Binary Source Code," IEEE Trans. on Info. Theory, vol. 28, pp. 800-803, 1982.

[11] W. B. Pennebaker et al., "An Overview of the Basic Principles of the Q-Coder Adaptive Binary Arithmetic Coder," IBM J. Res. Develop., vol. 32, pp. 717-726, 1988.

[12] S. W. Golomb, "Run-Length Encodings," IEEE Trans. on Info. Theory, vol. 12, pp. 399-401, 1966.

[13] ISO/IEC 11544 International Standard (JBIG), Information Technology - Coded Representation of Picture and Audio Information - Progressive Bi-level Image Compression, 1993.

[14] ISO/IEC 14492 International Standard (JBIG2), Information Technology - Lossy/Lossless Coding of Bi-level Images, 2001.

[15] ISO/IEC 14496-2 International Standard (MPEG-4:2), Information technology - Coding of audio-visual objects - Part 2: Video, 1999.

[16] ISO/IEC 15444 International Standard (JPEG 2000), Information technology - JPEG 2000 image coding system, 2000.

[17] ISO/IEC 10918 International Standard (JPEG), Information technology - Digital compression and coding of continuous-tone still images, 1994.

[18] ISO/IEC 14496-10 International Standard (MPEG-4:10), Information Technology - Coding of Audio-Visual Objects - Part 10: Advanced Video Coding (FDIS), Klagenfurt, AT, 2003.

[19] J. Rissanen and G. G. Langdon, "Universal Modeling and Coding," IEEE Trans. on Info. Theory, vol. 27, pp. 12-23, 1981.

[20] A. Eleftheriadis, "Flavor: A Language for Media Representation," in ACM Int. Conf. on Multimedia, 1997, Proceedings, pp. 1-9.

[21] Y. Fang and A. Eleftheriadis, "Automatic Generation of Entropy Coding Programs Using Flavor," in IEEE Workshop on Multimedia Signal Processing, 1998, Proceedings, pp. 341-346.

[22] A. Moffat, R. M. Neal, and I. H. Witten, "Arithmetic Coding Revisited," ACM Transactions on Information Systems, vol. 16, pp. 256-294, 1998.

[23] M. Schindler, "A Fast Renormalisation for Arithmetic Coding," in IEEE Data Compression Conference, 1998, Proceedings, p. 572.

[24] L. Stuiver and A. Moffat, "Piecewise Integer Mapping for Arithmetic Coding," in IEEE Data Compression Conference, 1998, Proceedings, pp. 3-12.

[25] A. Moffat and A. Turpin, Compression and Coding Algorithms, Kluwer Academic, 2002.

[26] D. Marpe and T. Wiegand, "A Highly Efficient Multiplication-Free Binary Arithmetic Coder and Its Application in Video Coding," in IEEE Int. Conf. on Image Processing, 2003, Proceedings, pp. 263-266.

[27] P. G. Howard and J. S. Vitter, "Practical Implementations of Arithmetic Coding," in Image and Text Compression, Kluwer Academic, 1992, pp. 85-112.