hm inter prediction 111022 r3
Post on 28-Dec-2015
HEVC Inter prediction
Kwangwoon University, Image Processing Systems Laboratory
2011-10-22 (SAT)
Contents
• Overview of inter prediction
• Inter prediction in HEVC
– GOP coding structure
– Adaptive motion vector prediction(AMVP)
– Merge
– Asymmetric motion partition(AMP)
– Interpolation filter
OVERVIEW OF INTER PREDICTION
Overview of inter prediction
• The encoder forms a model of the current frame based on the samples of a previously transmitted frame
• The motion-compensated predicted frame is subtracted from the current frame to produce a residual ‘error’ frame
• Transform coding of the residual frame
[Figure: motion estimation against the previous frame gives the motion-compensated frame, which is subtracted from the current frame to give the residual frame]
Overview of inter prediction
• The goals of inter prediction
– ME creates a model of the current frame based on available data in one or more previously encoded frames to match the current frame as closely as possible
[Figure: frame n-1 and frame n]
Overview of inter prediction
• Transmitted data
– Motion vector (PMV, MVD)
– Reference index (LIST_0/LIST_1)
– Prediction mode
– Residual data (quantized coefficients)
… …
… …
How?
GOP CODING STRUCTURE
GOP coding structure
• Temporal prediction structure
– All Intra (No temporal prediction is allowed)
– Low Delay (LD)
• The first picture shall be coded as IDR picture
• GPB (Generalized P and B) picture (on/off)
– Random access (RA)
• Hierarchical B structure shall be used for coding
• An IDR or CDR (clean decoding refresh) intra picture shall be inserted cyclically, about once per second, as a random access point
GOP coding structure – Low delay
[Figure: low-delay GOP structure. Pictures 0 to 8 along the time axis; legend: IDR or intra picture, GPB (generalized P and B) picture. QP levels: QPI, QPBL1 = QPI+1, QPBL2 = QPI+2, QPBL3 = QPI+3]
GOP coding structure – Random access
[Figure: random-access GOP structure. Pictures labeled with POC (0 to 8) and coding order (0 to 8) along the time axis; legend: IDR or intra picture, GPB (generalized P and B) picture, referenced B picture, non-referenced B picture. QP levels: QPI, QPBL1 = QPI+1, QPBL2 = QPI+2, QPBL3 = QPI+3, QPBL4 = QPI+4]
GOP coding structure – Random access
Variables:
  m_iHrchDepth = log2GOP_size + 1;
  iTimeOffset  = (1 << (m_iHrchDepth - 1 - iDepth));
  iStep        = iTimeOffset << 1;
  iNumPicRcvd  = GOP_size;

for( iDepth = 0; iDepth < m_iHrchDepth; iDepth++ )
{
  iTimeOffset = (1 << (m_iHrchDepth - 1 - iDepth));
  iStep = iTimeOffset << 1;
  for( ; iTimeOffset <= iNumPicRcvd; )
  {
    compressSlice();
    iTimeOffset += iStep;
  }
}
[Figure: the same random-access GOP structure, with each picture marked by its hierarchy depth (Depth == 0, Depth == 1, Depth == 2, Depth == 3)]
*uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset);
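The depth loop and the POC formula above can be tried in isolation. The following is a minimal sketch, not HM code: `hierarchicalOrder` is a hypothetical helper, and `compressSlice()` is replaced by recording the POC that would be coded.

```cpp
#include <vector>

// Hypothetical helper (not part of HM): returns the POCs of one GOP in
// coding order, following the depth loop shown on the slide.
std::vector<int> hierarchicalOrder(int gopSize, int pocLast) {
    int log2Gop = 0;
    while ((1 << (log2Gop + 1)) <= gopSize) ++log2Gop;  // floor(log2(gopSize))
    const int hrchDepth = log2Gop + 1;                   // m_iHrchDepth
    const int numPicRcvd = gopSize;                      // iNumPicRcvd
    std::vector<int> order;
    for (int depth = 0; depth < hrchDepth; ++depth) {
        int timeOffset = 1 << (hrchDepth - 1 - depth);
        const int step = timeOffset << 1;
        for (; timeOffset <= numPicRcvd; timeOffset += step) {
            // uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset)
            order.push_back(pocLast - (numPicRcvd - timeOffset));
        }
    }
    return order;
}
```

For a GOP of 8 pictures ending at POC 8, this yields the order 8, 4, 2, 6, 1, 3, 5, 7: one picture at depth 0, one at depth 1, two at depth 2 and four at depth 3.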
AMVP (ADAPTIVE MOTION VECTOR PREDICTION)
MV prediction of H.264/AVC
• Median of each component of MV
– No transmission overhead
• Slice-based use of temporal MV predictor
[Figure: neighboring blocks A, B and C of the current block]
$MV_x = \mathrm{MEDIAN}(A_x, B_x, C_x)$
$MV_y = \mathrm{MEDIAN}(A_y, B_y, C_y)$
Fig. Spatial neighboring block
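The component-wise median can be written in a few lines. This is an illustrative sketch only; the `Mv` struct and the function names are assumptions, not identifiers from the H.264/AVC reference software.

```cpp
#include <algorithm>

struct Mv { int x; int y; };  // illustrative motion vector type

// Median of three integers.
inline int median3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Component-wise median of the neighbouring motion vectors A, B and C.
inline Mv medianPredictor(const Mv& a, const Mv& b, const Mv& c) {
    return { median3(a.x, b.x, c.x), median3(a.y, b.y, c.y) };
}
```

Because both encoder and decoder can compute this from already decoded neighbours, no predictor index has to be transmitted.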
MV prediction of HEVC
• Explicit signaling of MV predictor index
– Transmission overhead
• PU-based use of temporal MV predictor
[Figure: spatial candidate positions A0, A1, B0, B1 and B2 around the current block]
Fig. Spatial AMVP candidates
[Figure: center and right-bottom positions of the co-located PU]
Fig. Temporal AMVP candidates
AMVP
• Decoder receives
– ref_idx
– mvd
– mvp_idx
AMVP – Decoder side AMVP syntax
prediction_unit( x0, y0, log2CUSize ) {                              Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                            ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                         u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                          ue(v) | ae(v)
    } else {
      if( slice_type == B ) {
        if( !entropy_coding_mode_flag ) {
          combined_inter_pred_ref_idx                                ue(v)
          if( combined_inter_pred_ref_idx == MaxPredRef )
            inter_pred_flag[ x0 ][ y0 ]                              ue(v)
        } else
          inter_pred_flag[ x0 ][ y0 ]                                ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] == Pred_LC ) {
        if( num_ref_idx_lc_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_lc_minus4[ x0 ][ y0 ]                          ue(v)
          } else
            ref_idx_lc[ x0 ][ y0 ]                                   ae(v)
        }
        mvd_lc[ x0 ][ y0 ][ 0 ]                                      se(v) | ae(v)
        mvd_lc[ x0 ][ y0 ][ 1 ]                                      se(v) | ae(v)
        mvp_idx_lc[ x0 ][ y0 ]                                       ue(v) | ae(v)
      }
AMVP – Decoder side AMVP syntax
      else { /* Pred_L0 or Pred_BI */
        if( num_ref_idx_l0_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l0_minusX[ x0 ][ y0 ]                          ue(v)
          } else
            ref_idx_l0_minusX[ x0 ][ y0 ]                            ue(v) | ae(v)
        }
        mvd_l0[ x0 ][ y0 ][ 0 ]                                      se(v) | ae(v)
        mvd_l0[ x0 ][ y0 ][ 1 ]                                      se(v) | ae(v)
        mvp_idx_l0[ x0 ][ y0 ]                                       ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] == Pred_BI ) {
        if( num_ref_idx_l1_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx == MaxPredRef )
              ref_idx_l1_minusX[ x0 ][ y0 ]                          ue(v)
          } else
            ref_idx_l1[ x0 ][ y0 ]                                   ue(v) | ae(v)
        }
        mvd_l1[ x0 ][ y0 ][ 0 ]                                      se(v) | ae(v)
        mvd_l1[ x0 ][ y0 ][ 1 ]                                      se(v) | ae(v)
        mvp_idx_l1[ x0 ][ y0 ]                                       ue(v) | ae(v)
      }
    }
  }
}
AMVP – Encoder side processing
1. Search for three candidates (spatial: 2, temporal: 1)
2. Remove redundant MVPs
3. Additional candidate list
– Zero vector candidates are created by combining zero vector and refIdx
4. Decision of best MVP before motion estimation (starting point for ME)
– Distortion: SAD
– Rate: Truncated unary code (MVP index)
– RDCost = Distortion + (Bits*λ + 0.5)>>16;
5. Decision of the best MVP candidate after motion estimation
– Best MVP index: smallest mvd = Best_MV - MV of mvp_idx[i]

mvp_idx   bin
0         0
1         10
2         110
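Step 5 can be sketched as a search over the candidate list for the predictor that leaves the smallest motion vector difference. This sketch assumes, for illustration only, that "smallest mvd" is measured as |mvd.x| + |mvd.y|; a real encoder would more likely compare the bits needed for mvd and mvp_idx. `Mv` and `bestMvpIndex` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Mv { int x; int y; };  // illustrative motion vector type

// Returns the index of the candidate giving the smallest |mvd.x| + |mvd.y|,
// where mvd = bestMv - candidate, or -1 if the list is empty.
// Assumption: the sum of absolute components stands in for the mvd cost.
int bestMvpIndex(const std::vector<Mv>& candidates, const Mv& bestMv) {
    int bestIdx = -1;
    int bestCost = 0;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const int cost = std::abs(bestMv.x - candidates[i].x) +
                         std::abs(bestMv.y - candidates[i].y);
        if (bestIdx < 0 || cost < bestCost) {
            bestIdx = static_cast<int>(i);
            bestCost = cost;
        }
    }
    return bestIdx;
}
```

Ties keep the lower index, which also has the shorter bin string in the table above.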
AMVP – Spatial AMVP candidates
• Spatial AMVP candidates
– mvLxA: Left spatial candidates
• Derivation order: A0 ⇒ A1
• First available MV
– 1st: scan without scaling (vec1, vec2)
– 2nd: scan with scaling (vec3, vec4)
– mvLxB: Above spatial candidates
• Derivation order: B0 ⇒ B1 ⇒ B2
• First available MV
– 1st: scan without scaling (vec1, vec2)
– 2nd: scan with scaling, if scaling wasn’t used before (vec3, vec4)
Fig. Spatial AMVP candidates
AMVP – Spatial AMVP candidates
• Spatial AMVP candidates
– Four candidates can be derived at each neighboring PU
• vec1: same reference index, same list
• vec2: same reference index, different list
• vec3: different reference index, same list
• vec4: different reference index, different list
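The four candidate types reduce to two yes/no questions about the neighboring PU's motion data. A minimal sketch with hypothetical function names, using the scan rule from the previous slide (vec1 and vec2 are taken without scaling, vec3 and vec4 with scaling):

```cpp
// Returns 1..4 for vec1..vec4, following the classification listed above.
inline int candidateType(bool sameRefIdx, bool sameList) {
    if (sameRefIdx) {
        return sameList ? 1 : 2;  // vec1 / vec2
    }
    return sameList ? 3 : 4;      // vec3 / vec4
}

// vec1 and vec2 are used as-is; vec3 and vec4 need MV scaling.
inline bool needsScaling(int type) {
    return type >= 3;
}
```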
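[Figure: pictures i, j, k, l, m along the time axis, with the current block and neighboring block b; the neighboring block's L0 and L1 motion vectors are labeled 1 to 4 according to the candidate types above]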
AMVP – Temporal AMVP candidate
• Temporal AMVP candidate
– Derivation order:
1. Right-bottom position of co-located PU
2. Center position of co-located PU
[Figure: center and right-bottom positions of the co-located PU]
Fig. Temporal AMVP candidates
[Figure: current picture, co-located picture and reference picture; mvL0 and mvL1 of the current block are derived from mvL1Col of the co-located partition]
AMVP - MV Scaling
• Scaling of MV predictor has been modified (JCTVC-F142)
– HM3 rounds half towards plus infinity
– Proposed scheme rounds half towards zero
HM version   Modification
HM3          $(\mathit{DistScaleFactor} \times mv + 128) \gg 8$
HM4          $\mathrm{Sign}(\mathit{DistScaleFactor} \times mv) \times \left( (|\mathit{DistScaleFactor} \times mv| + 127) \gg 8 \right)$
DistScaleFactor: scaling factor
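The difference between the two rounding rules shows up only at exact halves. The sketch below implements both formulas from the table; the function names are hypothetical, clipping of the result is omitted, and the HM3 variant relies on arithmetic right shift of negative values (guaranteed from C++20, and the usual behaviour of mainstream compilers before that).

```cpp
#include <cstdlib>

// HM3-style scaling: rounds half towards plus infinity.
inline int scaleMvHm3(int distScaleFactor, int mv) {
    return (distScaleFactor * mv + 128) >> 8;
}

// HM4-style scaling (JCTVC-F142): rounds half towards zero.
inline int scaleMvHm4(int distScaleFactor, int mv) {
    const int product = distScaleFactor * mv;
    const int sign = (product > 0) - (product < 0);
    return sign * ((std::abs(product) + 127) >> 8);
}
```

With DistScaleFactor = 128 (a factor of 0.5) and mv = 3, the exact result is 1.5: the HM3 rule gives 2 and the HM4 rule gives 1. For mv = -3 both give -1, so only the HM4 rule treats positive and negative vectors symmetrically.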
MERGE
Merge
• Decoder receives
– merge_flag
– merge_idx
• ref_idx, mvd and mvp_idx are not transmitted; the motion data are taken from the selected merge candidate
[Figure: spatial candidate positions A, B, C, D and E around the current block; center and right-bottom positions of the co-located PU]
Fig. Merge candidates
Merge – Decoder side Merge skip syntax
coding_unit( x0, y0, log2CUSize ) {                                  Descriptor
  if( entropy_coding_mode_flag && slice_type != I )
    skip_flag[ x0 ][ y0 ]                                            u(1) | ae(v)
  if( skip_flag[ x0 ][ y0 ] )
    prediction_unit( x0, y0, log2CUSize, log2CUSize, 0, 0 )
  else {
    …
  }
}

prediction_unit( x0, y0, log2CUSize ) {                              Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                            ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    …
  }
}
• Merge skip
Merge – Decoder side Merge syntax
prediction_unit( x0, y0, log2CUSize ) {                              Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]                                            ue(v) | ae(v)
  } else if( PredMode == MODE_INTRA ) {
    …
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]                                         u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]                                          ue(v) | ae(v)
    } else {
      …
    }
  }
}
• General case - merge
Merge – Encoder side processing
1. Search for five candidates
– Output: Mv, RefIdx, Predflag for LIST_0/LIST_1
– S0, S1, S2, S3: Spatial candidates
– Col: Temporal candidate
2. Remove redundant candidates
3. Additional candidate list (JCTVC-F470)
– Combined bi-directional merge candidate (5 times)
– Scaled bi-directional merge candidate (1 time)
– Zero vector merge candidate
4. Decision of the best MRG candidate
[Figure: candidate list S0, S1, S2, S3, Col]

merge_idx   bin
0           0
1           10
2           110
3           1110
4           1111
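The bin strings in the merge_idx table follow a truncated unary code with a maximum value of 4: the index is written as that many 1s followed by a terminating 0, and the 0 is dropped for the largest index. A minimal sketch (the function name is hypothetical):

```cpp
#include <cstddef>
#include <string>

// Truncated unary binarization: `value` ones, then a terminating zero
// unless value == maxValue.
inline std::string truncatedUnary(int value, int maxValue) {
    std::string bins(static_cast<std::size_t>(value), '1');
    if (value < maxValue) {
        bins += '0';
    }
    return bins;
}
```

Calling `truncatedUnary(i, 4)` for i = 0..4 reproduces the five rows of the table.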
Merge – Spatial merge candidates
• Spatial merge candidates (4 candidates)
– Derivation Order: A, B, C, D, E
Fig. Spatial merge candidates
Merge – Temporal merge candidate
• refIdx derivation for merge TMVP (JCTVC-E481)
– Decide three refIdx
• refIdxLeft: A
• refIdxAbove: B
• refIdxCorner: C or D or E
– Take the majority of them
– If none of the three is available
• refIdx = 0
– Otherwise
• Use the minimum of the available refIdx values
• Derivation of temporal merge candidate
– Same process as TMVP
Merge – Temporal merge candidate
• Example) Decision of reference frame
[Figure: merge candidate positions A to E around the current block, and the neighbors A, B and E of the example PU]
ex) second 4x8 PU in 8x8 CU

Neighbor   LIST     RefIdx
A          LIST_0   0
           LIST_1   1
B          LIST_0   -1
           LIST_1   1
C          NULL
D          NULL
E          LIST_0   -1
           LIST_1   1

               LIST_0   LIST_1
refIdxLeft     0        1
refIdxAbove    -1       1
refIdxCorner   -1       1

Selected refIdx:
LIST_0   0
LIST_1   1
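One possible reading of the refIdx rule, applied per reference list, is sketched below. It is an interpretation of the slide, not the JCTVC-E481 text or HM code: a value of -1 marks an unavailable neighbor, a majority only counts among available values, and `temporalMergeRefIdx` is a hypothetical name.

```cpp
#include <array>

// refIdx for the temporal merge candidate of one list.
// -1 marks an unavailable neighbour (assumption of this sketch).
int temporalMergeRefIdx(int refIdxLeft, int refIdxAbove, int refIdxCorner) {
    const std::array<int, 3> idx = { refIdxLeft, refIdxAbove, refIdxCorner };
    // Majority: an available value that occurs at least twice.
    for (int i = 0; i < 3; ++i) {
        for (int j = i + 1; j < 3; ++j) {
            if (idx[i] >= 0 && idx[i] == idx[j]) {
                return idx[i];
            }
        }
    }
    // Otherwise the minimum of the available values, or 0 if none is available.
    int best = -1;
    for (int v : idx) {
        if (v >= 0 && (best < 0 || v < best)) {
            best = v;
        }
    }
    return best < 0 ? 0 : best;
}
```

For the example above, LIST_0 has (0, -1, -1) and yields 0, and LIST_1 has (1, 1, 1) and yields 1, matching the selected values.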
Merge – Additional cand. list
1. Combined bi-directional merge candidate (5 times)
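[Figure: mvL0_A (uni-directional, List 0 Ref 0) and mvL1_B (uni-directional, List 1 Ref 0) of the current block are combined into one bi-directional candidate using mvL0_A and mvL1_B]

Candidate list before:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_B, ref 0
2
3
4

Candidate list after:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_B, ref 0
2       mvL0_A, ref 0    mvL1_B, ref 0
3
4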
Merge – Additional cand. list
2. Scaled bi-directional merge candidate (1 time)
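[Figure: current block with mvL0_A (ref 0), its scaled version mvL0’_A (ref 0) and mvL1_A (ref 1); reference pictures List 0 Ref 0, List 0 Ref 1, List 1 Ref 0, List 1 Ref 1]

Candidate list before:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_A, ref 1
2
3
4

Candidate list after:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_A, ref 1
2       mvL0_A, ref 0    mvL0’_A, ref 0
3
4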
Merge – Additional cand. list
3. Zero vector merge
– Zero vector merge candidates are created by combining zero vector and refIdx
Candidate list before:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_A, ref 1
2       mvL0_A, ref 0    mvL1_A, ref 1
3
4

Candidate list after:
Merge   L0               L1
0       mvL0_A, ref 0    -
1       -                mvL1_A, ref 1
2       mvL0_A, ref 0    mvL1_A, ref 1
3       (0,0), ref 0     (0,0), ref 0
4
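Filling the remaining slots with zero-vector candidates can be sketched as follows. The types and the function name are hypothetical, and the rule that refIdx increases by one per added candidate (up to the number of reference pictures) is an assumption of this sketch; the slide only shows the first added entry, which uses ref 0 for both lists.

```cpp
#include <cstddef>
#include <vector>

struct Mv { int x; int y; };
// refIdx of -1 means the list is not used by this candidate.
struct MergeCand { Mv mvL0; int refIdxL0; Mv mvL1; int refIdxL1; };

// Appends (0,0) candidates until the list holds maxCands entries.
// Assumption: refIdx starts at 0 and increases while below numRefIdx.
void padWithZeroCandidates(std::vector<MergeCand>& list,
                           std::size_t maxCands, int numRefIdx) {
    int refIdx = 0;
    while (list.size() < maxCands) {
        list.push_back({ {0, 0}, refIdx, {0, 0}, refIdx });
        if (refIdx + 1 < numRefIdx) {
            ++refIdx;
        }
    }
}
```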
AMP (ASYMMETRIC MOTION PARTITION)
Asymmetric motion partition (AMP)
• Rectangular shape PU splitting of a block for inter prediction
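• AMP is used for CU sizes from 64x64 down to 16x16
• AMP improves the coding efficiency, since irregular image patterns can be represented more efficiently than with symmetric partitions
[Figure: the four AMP partition shapes: 2NxnU, 2NxnD, nLx2N, nRx2N]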
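The four shapes can be written down as PU rectangles inside a 2Nx2N CU. This sketch assumes the 1:3 split used by AMP in HEVC (the small part is a quarter of the CU size, so n = N/2); the enum, struct and function names are illustrative.

```cpp
#include <array>

enum class AmpMode { Part2NxnU, Part2NxnD, PartnLx2N, PartnRx2N };

struct PuRect { int x; int y; int width; int height; };

// Returns the two PUs of an AMP-partitioned CU of size cuSize x cuSize.
std::array<PuRect, 2> ampPartitions(AmpMode mode, int cuSize) {
    const int q = cuSize / 4;  // short side of the smaller PU
    switch (mode) {
    case AmpMode::Part2NxnU:   // small PU on top
        return {{ {0, 0, cuSize, q}, {0, q, cuSize, cuSize - q} }};
    case AmpMode::Part2NxnD:   // small PU at the bottom
        return {{ {0, 0, cuSize, cuSize - q}, {0, cuSize - q, cuSize, q} }};
    case AmpMode::PartnLx2N:   // small PU on the left
        return {{ {0, 0, q, cuSize}, {q, 0, cuSize - q, cuSize} }};
    case AmpMode::PartnRx2N:   // small PU on the right
        return {{ {0, 0, cuSize - q, cuSize}, {cuSize - q, 0, q, cuSize} }};
    }
    return {{ {0, 0, cuSize, cuSize}, {0, 0, 0, 0} }};  // not reached
}
```

For a 32x32 CU, 2NxnU gives a 32x8 PU above a 32x24 PU.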
Asymmetric motion partition (AMP)
               Random access HE          Random access LC
               Y       U       V         Y       U       V
Class A        -0.9    -1.2    -0.9      -0.7    -0.7    -0.5
Class B        -0.9    -1.0    -1.0      -0.7    -0.7    -0.6
Class C        -0.9    -1.0    -1.1      -0.7    -0.9    -0.9
Class D        -0.8    -1.0    -0.9      -0.5    -0.7    -0.6
Class E
Overall        -0.9    -1.0    -1.0      -0.7    -0.7    -0.7
Enc Time[%]    144%                      151%
Dec Time[%]    99%                       99%

               Low delay (B) HE          Low delay (B) LC
               Y       U       V         Y       U       V
Class A
Class B        -1.1    -1.5    -1.5      -0.9    -0.8    -0.6
Class C        -1.0    -1.2    -1.3      -0.7    -0.6    -0.7
Class D        -1.1    -1.3    -1.5      -0.6    -0.5    -0.9
Class E        -2.3    -2.2    -2.4      -1.7    -1.1    -1.3
Overall        -1.3    -1.5    -1.6      -0.9    -0.7    -0.8
Enc Time[%]    144%                      150%
Dec Time[%]    99%                       99%

Table. Experimental result of AMP without encoding speed-up
               Random access HE          Random access LC
               Y       U       V         Y       U       V
Class A        -0.5    -0.8    -0.5      -0.4    -0.6    -0.2
Class B        -0.5    -0.8    -0.7      -0.4    -0.5    -0.5
Class C        -0.6    -0.8    -0.8      -0.5    -0.6    -0.7
Class D        -0.5    -0.9    -0.8      -0.4    -0.5    -0.6
Class E
Overall        -0.5    -0.8    -0.7      -0.4    -0.6    -0.5
Enc Time[%]    112%                      112%
Dec Time[%]    99%                       98%

               Low delay (B) HE          Low delay (B) LC
               Y       U       V         Y       U       V
Class A
Class B        -0.7    -1.1    -1.2      -0.5    -0.4    -0.3
Class C        -0.7    -1.0    -0.9      -0.4    -0.4    -0.7
Class D        -0.7    -1.2    -0.8      -0.5    -0.7    -0.2
Class E        -1.5    -1.9    -1.7      -1.0    -1.0    -0.9
Overall        -0.8    -1.2    -1.1      -0.6    -0.6    -0.5
Enc Time[%]    111%                      111%
Dec Time[%]    100%                      99%

Table. Experimental result of AMP with encoding speed-up
INTERPOLATION FILTER
Interpolation filter of H.264/AVC
• Quarter-sample motion vector accuracy
– Cascaded filtering for luma: 6-tap filter for half-pel positions, then bi-linear interpolation for quarter-pel positions
– Bi-linear filter for chroma (1/8-sample accuracy)
Integer-pel – no interpolation
Half-pel – 6-tap
Quarter-pel – 6-tap + bi-linear
Interpolation filter of HEVC
• Quarter-sample motion vector accuracy
– 1-pass filter for luma: 8-tap for both half-pel and quarter-pel positions
– 4-tap filter for chroma (1/8-sample accuracy)
Integer-pel – no interpolation
Half-pel – 8-tap
Quarter-pel – 8-tap
Interpolation filter
• Two modifications in HM4.0 and WD4.0
– The motion compensation process is simplified by removing rounding operations
– All data after each of the vertical and horizontal filtering passes is guaranteed to fit in 16-bit memory
– Advantage
• Software simpler
• Text simpler
• No difference in performance
Interpolation filter
• Integer samples
– Upper-case letters
• Fractional sample positions
– Lower-case letters
– For quarter sample luma interpolation
A-1,-1   A0,-1   a0,-1   b0,-1   c0,-1   A1,-1   A2,-1
A-1,0    A0,0    a0,0    b0,0    c0,0    A1,0    A2,0
d-1,0    d0,0    e0,0    f0,0    g0,0    d1,0
h-1,0    h0,0    i0,0    j0,0    k0,0    h1,0
n-1,0    n0,0    p0,0    q0,0    r0,0    n1,0
A-1,1    A0,1    a0,1    b0,1    c0,1    A1,1    A2,1
A-1,2    A0,2                            A1,2    A2,2
Interpolation filter
• Interpolation filter coefficients
– Luma
α     Filter(α)
1/4   { -1, 4, -10, 57, 19, -7, 3, -1 }
1/2   { -1, 4, -11, 40, 40, -11, 4, -1 }
– Chroma
α     Filter(α)
1/8   { -3, 60, 8, -1 }
1/4   { -4, 54, 16, -2 }
3/8   { -5, 46, 27, -4 }
1/2   { -4, 36, 36, -4 }
Interpolation filter
• Luma interpolation process (1D interpolation filter)
– For fractional positions a0,0, b0,0 and c0,0, horizontal 1D filter is used.
– For fractional positions d0,0, h0,0 and n0,0, vertical 1D filter is used.
– The input of 1D interpolation function is integer position values.
– The output is interpolated value X, which has fractional position α.
Interpolation filter
• Example) 1/2 position b0,0
– 8-tap separable DCTIF coefficient of 1/2 position
{ -1, 4, -11, 40, 40, -11, 4, -1 }
$b_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0} + 32 \right) / 64$
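The half-pel formula is a single 8-tap dot product followed by a rounded division by 64. A minimal sketch with hypothetical names; the division is written as a right shift by 6, which is equivalent for non-negative sums, and clipping to the sample range is omitted.

```cpp
#include <array>

// 8-tap DCTIF coefficients of the 1/2 position (from the luma table).
constexpr std::array<int, 8> kHalfPel = { -1, 4, -11, 40, 40, -11, 4, -1 };

// `row` points at the integer sample A(0,0); the function reads
// row[-3] .. row[4], i.e. A(-3,0) .. A(4,0), and returns b(0,0).
inline int halfPelB(const int* row) {
    int sum = 0;
    for (int k = 0; k < 8; ++k) {
        sum += kHalfPel[k] * row[k - 3];
    }
    return (sum + 32) >> 6;  // + 32, then divide by 64
}
```

Since the coefficients add up to 64, a flat area is reproduced exactly, and on a linear ramp the result is the midpoint of A(0,0) and A(1,0).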
Interpolation filter
• Luma interpolation process (2D separable interpolation filter)
– For remaining positions first horizontal 1D filter is applied for extended block, and then vertical 1D filter is used.
Interpolation filter
• Example) 1/4 position e0,0
– 2D separable Interpolation
– 8×horizontal 1D filter + 1×vertical 1D filter
Interpolation filter
• 1D filtering
• 2D filtering
– Intermediate value should be saved and processed
$a_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} + \mathit{offset1} \right) \gg \mathit{shift1}$
$a_{0,0} = \left( -A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} \right) \gg \mathit{shift1}$
$d1_{i,0} = -A_{i,-3} + 4A_{i,-2} - 10A_{i,-1} + 57A_{i,0} + 19A_{i,1} - 7A_{i,2} + 3A_{i,3} - A_{i,4}$
$e_{0,0} = \left( -d1_{-3,0} + 4\,d1_{-2,0} - 10\,d1_{-1,0} + 57\,d1_{0,0} + 19\,d1_{1,0} - 7\,d1_{2,0} + 3\,d1_{3,0} - d1_{4,0} + \mathit{offset2} \right) \gg \mathit{shift2}$
$e_{0,0} = \left( -a_{0,-3} + 4a_{0,-2} - 10a_{0,-1} + 57a_{0,0} + 19a_{0,1} - 7a_{0,2} + 3a_{0,3} - a_{0,4} \right) \gg \mathit{shift2}$
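The 2D case can be sketched as eight horizontal 1D filterings followed by one vertical 1D filtering, as on the e0,0 example slide. This is a simplified illustration, not the HM precision handling: the intermediate values are kept unshifted in `int`, so the final stage divides by the total gain 64 × 64 = 4096, and no clipping is applied. All names are hypothetical.

```cpp
#include <array>
#include <vector>

// 8-tap coefficients of the 1/4 position (from the luma table).
constexpr std::array<int, 8> kQuarterPel = { -1, 4, -10, 57, 19, -7, 3, -1 };

// Computes e(0,0) relative to the integer sample at (x, y) of a row-major
// image with the given stride. Needs 3 samples of margin to the left and
// top and 4 to the right and bottom.
int quarterQuarterE(const std::vector<int>& img, int stride, int x, int y) {
    std::array<int, 8> a{};  // horizontally filtered values for rows y-3 .. y+4
    for (int r = 0; r < 8; ++r) {
        int sum = 0;
        for (int k = 0; k < 8; ++k) {
            sum += kQuarterPel[k] * img[(y + r - 3) * stride + (x + k - 3)];
        }
        a[r] = sum;  // intermediate value, saved without rounding
    }
    int sum = 0;
    for (int r = 0; r < 8; ++r) {
        sum += kQuarterPel[r] * a[r];  // vertical pass over the intermediates
    }
    return (sum + 2048) >> 12;  // round and remove the gain of 4096
}
```

Deferring all rounding to the last stage is what makes it necessary to store the intermediate values at higher precision than the samples themselves.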
Interpolation filter – Example (modified version for seminar)

template<int N, bool isVertical, bool isFirst, bool isLast>
Void TComInterpolationFilter::filter(Short const *src, Int srcStride, Short *dst, Int dstStride, Int width, Int height, Short const *coeff)
{
  Int row, col;
  Int cStride = ( isVertical ) ? srcStride : 1;  // isVertical: true (vertical filtering), false (horizontal filtering)
  src -= ( N/2 - 1 ) * cStride;                  // N: 8 (luma), 4 (chroma)
  Int offset;
  Short maxVal;
  Int headRoom = IF_INTERNAL_PREC - (g_uiBitDepth + g_uiBitIncrement);  // IF_INTERNAL_PREC: 14; g_uiBitDepth + g_uiBitIncrement: 10 (HE), 8 (LC)
  Int shift = IF_FILTER_PREC;                    // IF_FILTER_PREC: 6

  // isFirst: whether this is the first filtering pass
  if ( isLast )
  { // last filtering pass
    shift += (isFirst) ? 0 : headRoom;           // isFirst: shift = 6; otherwise: shift = 6+4 (HE), 6+6 (LC)
    offset = 1 << (shift - 1);                   // isFirst: offset = 1<<5 = 32; otherwise: offset = 1<<9 (HE), 1<<11 (LC)
    offset += (isFirst) ? 0 : IF_INTERNAL_OFFS << IF_FILTER_PREC;  // !isFirst: offset = (1<<9) + (1<<11) (HE), (1<<11) + (1<<11) (LC)
    maxVal = g_uiIBDI_MAX;                       // maxVal: 1023 (HE), 255 (LC)
  }
  else
  { // other case
    shift -= (isFirst) ? headRoom : 0;           // isFirst: shift = 2 (HE), 0 (LC)
    offset = (isFirst) ? -IF_INTERNAL_OFFS << shift : 0;  // isFirst: -((1<<(IF_FILTER_PREC-1))<<2) = -(1<<7) (HE), -(1<<5) (LC)
    maxVal = 0;
  }

  for (row = 0; row < height; row++)
  {
    for (col = 0; col < width; col++)
    {
      Int sum;
      sum  = src[ col + 0 * cStride] * coeff[0];
      sum += src[ col + 1 * cStride] * coeff[1];
      sum += src[ col + 2 * cStride] * coeff[2];
      sum += src[ col + 3 * cStride] * coeff[3];
      sum += src[ col + 4 * cStride] * coeff[4];
      sum += src[ col + 5 * cStride] * coeff[5];
      sum += src[ col + 6 * cStride] * coeff[6];
      sum += src[ col + 7 * cStride] * coeff[7];
      Short val = ( sum + offset ) >> shift;
      if ( isLast )
      { // clipping in the last filtering pass
        val = ( val < 0 ) ? 0 : val;
        val = ( val > maxVal ) ? maxVal : val;
      }
      dst[col] = val;  // store the filtered output sample
    }
    src += srcStride;
    dst += dstStride;
  }
}
Interpolation filter – Example. Half-pel
Example) 1/2 position b0,0
– 8-tap separable DCTIF coefficients of the 1/2 position: { -1, 4, -11, 40, 40, -11, 4, -1 }
– isFirst = true; isLast = true (uni-directional case)
– shift = 6; offset = 1<<(6-1) = 32
– maxVal = 1023 (HE), 255 (LC)
– cStride = 1 (horizontal filtering)
Interpolation filter – Example. Quarter pel
Example) 1/4 position e0,0
– 2D separable interpolation, (1) horizontal filtering
– Coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }
– isFirst = true; isLast = false
– shift = 2 (HE), 0 (LC); offset = -(1<<7) (HE), -(1<<5) (LC)
– maxVal = 0
– cStride = 1 (horizontal filtering)
Interpolation filter – Example. Quarter pel
Example) 1/4 position e0,0
– 2D separable interpolation, (2) vertical filtering
– Coefficients of the 1/4 position: { -1, 4, -10, 57, 19, -7, 3, -1 }
– isFirst = false; isLast = true
– shift = 10 (HE), 12 (LC); offset = (1<<9) + (1<<11) (HE), (1<<11) + (1<<11) (LC)
– maxVal = 1023 (HE), 255 (LC)
– cStride = srcStride (vertical filtering)
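The shift values quoted in the three examples can be reproduced with a small helper that mirrors the `shift` logic of the listing. This is a sketch with hypothetical names; it covers only the shift and the `1 << (shift - 1)` rounding term, and deliberately leaves out the `IF_INTERNAL_OFFS` contribution to the offset.

```cpp
struct PassParams { int shift; int rounding; };

constexpr int kInternalPrec = 14;  // IF_INTERNAL_PREC on the slide
constexpr int kFilterPrec = 6;     // IF_FILTER_PREC on the slide

// bitDepth: 10 for HE, 8 for LC (g_uiBitDepth + g_uiBitIncrement).
inline PassParams passParams(bool isFirst, bool isLast, int bitDepth) {
    const int headRoom = kInternalPrec - bitDepth;
    int shift = kFilterPrec;
    int rounding = 0;
    if (isLast) {
        shift += isFirst ? 0 : headRoom;
        rounding = 1 << (shift - 1);   // rounding part of the offset only
    } else {
        shift -= isFirst ? headRoom : 0;
    }
    return { shift, rounding };
}
```

A single-pass filtering (`isFirst` and `isLast` both true) gives shift 6 and rounding 32. The first pass of a 2D filtering gives shift 2 (HE) or 0 (LC), and the last pass gives shift 10 with rounding 1<<9 (HE) or shift 12 with rounding 1<<11 (LC).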
THANK YOU