![Page 1: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/1.jpg)
Coding for Virtual and Augmented Reality
Philip A. Chou, Microsoft Research
Packet Video Workshop, 15 July 2016
![Page 2: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/2.jpg)
VR puts you in a Virtual World
![Page 3: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/3.jpg)
AR puts virtual objects in your world
![Page 4: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/4.jpg)
HMDs for Virtual Reality are Enclosed
Cardboard VR, Gear VR, Daydream
PlayStation VR, HTC Vive, Oculus Rift
![Page 5: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/5.jpg)
HMDs for Augmented Reality are See-Through
HoloLens Meta Daqri
![Page 6: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/6.jpg)
Mixed Reality: from AR to VR
Paul Milgram, Haruo Takemura, Akira Utsumi, Fumio Kishino, “Augmented reality: a class of displays on the reality-virtuality continuum,” Proc. SPIE Telemanipulator and Telepresence Technologies, Dec 1995
![Page 7: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/7.jpg)
Eventual Merging of VR and AR
• Video-see-through will be on VR devices
• Full immersion (high contrast and FOV) will be on AR devices
![Page 8: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/8.jpg)
Redirected Walking in VR
Razzaque, Kohn, Whitton (2001)
![Page 9: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/9.jpg)
“Cinematic” Content for VR and AR
VR AR
![Page 10: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/10.jpg)
VR Capture: “Inside-Out”
Lytro Immerge
Samsung Beyond
Nokia Ozo
GoPro Odyssey
Jaunt VR
Facebook Surround
![Page 11: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/11.jpg)
AR Capture: “Outside-In”
![Page 12: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/12.jpg)
![Page 13: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/13.jpg)
![Page 14: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/14.jpg)
![Page 15: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/15.jpg)
1G: TELEPHONE (1876), 2G: TELEVISION (1926), 3G: HOLOPORTATION (2016)
Immersive communication is the real-time exchange of the natural social signals between people who are geographically separated, as if they were co-located.
50 years from 1G to 2G; 90 years from 2G to 3G
VR/AR: Third Generation of Immersive Communication
![Page 16: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/16.jpg)
VR/AR: Fourth Generation Computing Platform (after PC, web, and mobile)
http://www.digi-capital.com/news/2016/01/augmentedvirtual-reality-revenue-forecast-revised-to-hit-120-billion-by-2020/
http://www.goldmansachs.com/our-thinking/pages/technology-driving-innovation-folder/virtual-and-augmented-reality/report.pdf
Goldman Sachs: $80B by 2025
![Page 17: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/17.jpg)
Coding
![Page 18: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/18.jpg)
Degree of Motion Parallax determines Coding
Virtual Reality (panoramic)
• Motion Parallax is limited because of limited movement
• Requires yaw, pitch, roll but not (much) x, y, z translation
• Stationary camera with 360° view makes sense (inside-out)
• Representation can be spherical video (possibly stereo, depth)
• Coding: map spherical video to rectangular video
Augmented Reality (volumetric)
• Motion Parallax is unlimited because roaming is allowed
• Requires x, y, z translation as well as yaw, pitch, roll (6DoF)
• Object capture makes sense (outside-in)
• Representation must be light field, mesh, point cloud, etc.
• Coding: novel compression techniques
![Page 19: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/19.jpg)
Omnidirectional Mappings
Equirectangular projection
- Simple, popular
- Large horizontal oversampling near poles
Equal-area projection
- Decreased vertical sampling near poles
Cube Map
- Popular in graphics community
Courtesy Hari Lakshman and MPEG/JPEG
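The equirectangular bullets can be illustrated with a minimal sketch. It assumes yaw is longitude in [-π, π), pitch is latitude in [-π/2, π/2], and a hypothetical 4096 × 2048 frame; the function names are illustrative and not taken from any standard.

```python
import math

def equirect_pixel(yaw, pitch, width, height):
    """Map a viewing direction to equirectangular pixel coordinates.

    yaw is longitude in [-pi, pi); pitch is latitude in [-pi/2, pi/2].
    """
    u = (yaw + math.pi) / (2.0 * math.pi) * width
    v = (math.pi / 2.0 - pitch) / math.pi * height
    return u, v

def horizontal_oversampling(pitch):
    """Horizontal sample density at a latitude, relative to the equator.

    Every image row has the same number of pixels, but the circle of
    latitude it covers shrinks by cos(pitch).
    """
    return 1.0 / math.cos(pitch)

# Looking straight ahead lands in the middle of the frame.
center = equirect_pixel(0.0, 0.0, 4096, 2048)

# Oversampling grows without bound toward the poles.
factors = [horizontal_oversampling(math.radians(d)) for d in (0, 60, 80)]
```

At 60° latitude each row already carries about twice as many samples per unit of arc as the equator, which is the oversampling the slide refers to.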
![Page 20: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/20.jpg)
Omnidirectional Mapping Application Framework (OMAF)
• MPEG activity
• Phases
• Timeline
OMAF v1 (2017)
• Basic framework
OMAF v2 (2018)
• 3D Audio metadata
• New projection
• AR
OMAF v3 (2020)
• New video coding
![Page 21: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/21.jpg)
Issues for VR
• Mapping
• Audio
• Video coding may need adjustment (e.g., motion comp)
• Evaluation criteria
• Streaming on demand
  • Directional or spatial random access, ROI coding
• Interactivity (much more rapid than trick modes)
• Latency
• Broadcasting
![Page 22: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/22.jpg)
Pipeline for Video / VR / AR
Stages: capture → fuse → encode → decode → render → display
Video: video → digital video → digital video → video
VR: array of video → digital video (may be stereo, may have depth) → digital video (may be stereo, may have depth) → display on:
• Browser
• Phone, tablet app
• Low-end VR device (e.g., Cardboard VR, Gear VR)
• High-end VR device (e.g., Oculus Rift, HTC Vive, PlayStation VR)
AR: array of video + depth → volumetric representation → volumetric representation → display on:
• Browser
• Phone, tablet app
• Low-end VR device (e.g., Cardboard VR, Gear VR)
• High-end VR device (e.g., Oculus Rift, HTC Vive, PlayStation VR)
• High-end AR device (e.g., HoloLens)
![Page 23: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/23.jpg)
Volumetric Representations
• At present, not even the representation is agreed upon
• Some candidates:
  • Dynamic Meshes
    • Advantages: typical pipeline, easy to interpolate as it represents a surface
    • Disadvantages: does not handle noise well, or non-surfaces
  • Dynamic Point Clouds
    • Advantages: easily processed in parallel, handles noise and non-surfaces
    • Disadvantages: hard to interpolate
  • Hybrids
  • Light Fields
  • Holograms
  • Dense volumetric functions and implicit surfaces
![Page 24: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/24.jpg)
Meshes and Point Clouds: Irregular Domains
![Page 25: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/25.jpg)
Graph Signal Processing (GSP): Framework for Signals on Irregular Domains
• Generalizes processing of real-valued signals defined on a regular domain (such as an image grid) to real-valued signals defined on a discrete graph.
• Applications to networks: social, sensor, communication, energy, transportation, neuronal; and geometry.
• Generalizes linear transform, impulse response, shift invariance, spectrum, frequency response, filtering, smoothing, interpolation, denoising, convolution, sampling, translation, etc.
Reference: Shuman, Narang, Frossard, Ortega, and Vandergheynst, “Signal Processing on Graphs,” IEEE Signal Processing Magazine, May 2013.
[Figure labels: domain $V$ with vertices $v \in V$; graph $(V, E)$; signal $x$ with values $x(v)$]
![Page 26: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/26.jpg)
Adjacency Matrix as Shift Operator
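• A graph $(V, E)$ is represented by an (optionally weighted) adjacency matrix $A = [a_{mn}]$, with $a_{mn} > 0$ being the weight on edge $(m, n) \in E$.
Example 1: Left Shift operator on N points on a circle
$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & \ddots & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$$
Eigenvectors are $e^{2\pi i mn/N}$, since $A\,[e^{2\pi i mn/N}] = [e^{2\pi i mn/N}]\,\mathrm{diag}(e^{2\pi i n/N})$
Example 2: Symmetric Random Walk operator on N points on a circle
$$A = \frac{1}{2} \begin{bmatrix} 0 & 1 & 0 & 1 \\ 1 & 0 & \ddots & 0 \\ 0 & \ddots & 0 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}$$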
A. Sandryhaila and J. M. F. Moura, “Discrete signal processing on graphs,” IEEE Transactions on Signal Processing, vol. 61, no. 7, pp.1644-1656, 2013
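Both circle examples can be checked numerically. The sketch below assumes N = 8 and takes the discrete Fourier vectors $e^{2\pi i mn/N}$ as the candidate eigenvectors; the variable names are illustrative.

```python
import numpy as np

N = 8

# Left shift on a circle: (A x)[m] = x[(m + 1) mod N].
A = np.roll(np.eye(N), 1, axis=1)

m = np.arange(N)
# Column n of Psi is the vector e^{2 pi i m n / N}, m = 0..N-1.
Psi = np.exp(2j * np.pi * np.outer(m, m) / N)
Lam = np.diag(np.exp(2j * np.pi * m / N))

# A Psi = Psi Lam: each Fourier vector is an eigenvector of the shift.
shift_ok = np.allclose(A @ Psi, Psi @ Lam)

# Symmetric random walk: average of the left and right shifts.
A_rw = 0.5 * (A + A.T)
# Same eigenvectors, with real eigenvalues cos(2 pi n / N).
rw_ok = np.allclose(A_rw @ Psi, Psi @ np.diag(np.cos(2 * np.pi * m / N)))
```

The random-walk operator keeps the Fourier eigenvectors but its eigenvalues become real, which is why a symmetric shift is convenient as a starting point.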
![Page 27: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/27.jpg)
Linear Shift Invariant operators and the GFT
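• {LSI operators} ⊇ {analytic functions $f(A)$} ⊇ {rational functions $q^{-1}(A)\,p(A)$} ⊇ {polynomials $p(A) = p_0 I + p_1 A + \cdots + p_M A^M$}
  • $f(A)A = \left(\sum_m p_m A^m\right) A = \sum_m p_m A^{m+1} = A \left(\sum_m p_m A^m\right) = A f(A)$
• If shift operator $A$ has eigenvectors $\Psi$ and eigenvalues $\Lambda$ (i.e., $A\Psi = \Psi\Lambda$) then any LSI operator $f(A)$ has eigenvectors $\Psi$ and eigenvalues $f(\Lambda)$ (i.e., $f(A)\Psi = \Psi f(\Lambda)$)
  • $A = \Psi\Lambda\Psi^{-1}$
  • $f(A) = \sum_m p_m A^m = \sum_m p_m (\Psi\Lambda\Psi^{-1})^m = \Psi \left(\sum_m p_m \Lambda^m\right) \Psi^{-1} = \Psi f(\Lambda) \Psi^{-1}$
  Therefore $f(A)x = \Psi f(\Lambda) \Psi^{-1} x$.
• $\Psi^{-1}x$ is known as the Graph Fourier Transform of $x$.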
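As a small numerical illustration of these identities, the sketch below builds a random symmetric shift (an assumption, so that $\Psi^{-1} = \Psi^T$), applies a polynomial filter with hypothetical coefficients in the vertex domain, and compares it with the same filter applied through the Graph Fourier Transform.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6

# Hypothetical symmetric weighted adjacency matrix used as the shift.
W = rng.random((N, N))
A = 0.5 * (W + W.T)
np.fill_diagonal(A, 0.0)

# A = Psi diag(lam) Psi^T, with orthogonal Psi because A is symmetric.
lam, Psi = np.linalg.eigh(A)

p = [0.5, -1.0, 0.25]  # hypothetical coefficients p_0, p_1, p_2

# Vertex domain: f(A) = p_0 I + p_1 A + p_2 A^2.
f_A = sum(c * np.linalg.matrix_power(A, k) for k, c in enumerate(p))
# Spectral domain: f(lambda) evaluated at every eigenvalue.
f_lam = sum(c * lam**k for k, c in enumerate(p))

x = rng.standard_normal(N)
x_hat = Psi.T @ x                    # Graph Fourier Transform of x
y_vertex = f_A @ x
y_spectral = Psi @ (f_lam * x_hat)   # scale each coefficient, then invert

commutes = np.allclose(f_A @ A, A @ f_A)   # shift invariance
same = np.allclose(y_vertex, y_spectral)
```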
![Page 28: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/28.jpg)
Graph Laplacian
• Often shift $A$ is stochastic (rows or columns sum to 1)
  • $L = I - A$ is the Graph Laplacian (same eigenvectors as $A$, eigenvalues $I - \Lambda$)
• More generally
  • $L = D - A$ is the Graph Laplacian ($D = \mathrm{diag}(d_i)$, $d_i = \sum_j A_{ij}$)
• Normalizations
  • $L = I - D^{-1}A$ is the “Random Walk” Graph Laplacian
  • $L = I - D^{-1/2} A D^{-1/2}$ is the “Normalized” Graph Laplacian
• Frequently $L = \Psi\Lambda\Psi^{-1}$ is the starting point instead of $A$, and real PSD
  • Eigenvalues $\mathrm{diag}(\Lambda)$ are real non-negative (the graph spectrum)
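The three Laplacians can be computed side by side on a toy graph. The 4-vertex weighted graph below is hypothetical and chosen only so that every degree is positive.

```python
import numpy as np

# Hypothetical weighted undirected graph on 4 vertices.
A = np.array([[0, 1, 0, 0],
              [1, 0, 2, 1],
              [0, 2, 0, 1],
              [0, 1, 1, 0]], dtype=float)

d = A.sum(axis=1)                       # degrees d_i = sum_j A_ij
D = np.diag(d)

L_comb = D - A                          # L = D - A
L_rw = np.eye(4) - np.diag(1.0 / d) @ A # "random walk" L = I - D^{-1} A
D_isqrt = np.diag(1.0 / np.sqrt(d))
L_norm = np.eye(4) - D_isqrt @ A @ D_isqrt  # "normalized" Laplacian

# Symmetric Laplacians have a real, non-negative spectrum.
spec_comb = np.linalg.eigvalsh(L_comb)
spec_norm = np.linalg.eigvalsh(L_norm)
```

The random-walk and normalized Laplacians are similar matrices, so they share a spectrum even though only the normalized one is symmetric.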
![Page 29: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/29.jpg)
Mesh as a Graph
![Page 30: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/30.jpg)
Eigenvectors of a Mesh
![Page 31: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/31.jpg)
Eigenvalues (Spectrum) as “Frequencies”
[Plots: “Zero-crossings of eigenvectors” (vertical axis 0 to 5 × 10^4) and “Eigenvalues of L” (vertical axis −2 to 12), each over a horizontal axis from 0 to 14000]
![Page 32: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/32.jpg)
Spectrum of Geometry as a Signal
[Plots: “Spectrum of X component as a signal”, “Spectrum of Y component as a signal”, “Spectrum of Z component as a signal”; horizontal axis 0 to 12, vertical axis −500 to 500]
![Page 33: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/33.jpg)
Energy Compaction
[Figure panels: 100% of coefficients, 50% of coefficients, 10% of coefficients, 2% of coefficients]
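Energy compaction can be reproduced on a toy example. The sketch below is not the mesh experiment from the slide: it assumes a 64-vertex path graph, a smooth hypothetical signal, and reconstruction from the lowest-frequency fraction of Graph Fourier coefficients.

```python
import numpy as np

N = 64

# Path graph Laplacian L = D - A.
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A

# eigh returns eigenvalues in ascending order: low to high graph frequency.
lam, Psi = np.linalg.eigh(L)

t = np.linspace(0.0, 1.0, N)
x = np.sin(2 * np.pi * t) + 0.5 * t     # smooth hypothetical signal
x_hat = Psi.T @ x                        # Graph Fourier coefficients

def reconstruct(fraction):
    """Rebuild x from the lowest-frequency `fraction` of its coefficients."""
    k = max(1, int(round(fraction * N)))
    trimmed = np.zeros_like(x_hat)
    trimmed[:k] = x_hat[:k]
    return Psi @ trimmed

# Relative reconstruction error for the fractions shown in the figure.
errors = {f: np.linalg.norm(x - reconstruct(f)) / np.linalg.norm(x)
          for f in (1.0, 0.5, 0.1, 0.02)}
```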
![Page 34: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/34.jpg)
Spectral Domain Filtering / Denoising
![Page 35: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/35.jpg)
Spectral Domain Filtering / Denoising
![Page 36: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/36.jpg)
Impulse Response is Localized
![Page 37: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/37.jpg)
Polynomials and Vertex-Domain Filtering
• Spectral filters do not generally have compact vertex-domain support.
• Spectral filters $f(\lambda)$ correspond to LSI transforms $f(L) = \Psi f(\Lambda) \Psi^T$.
• Polynomial spectral filters have compact support
  • $y = p(L)x = \sum_{m=0}^{M} p_m L^m x$
  • If $x(i) = \delta(i, i_0)$ then $y(i) \neq 0$ only if $d(i, i_0) \leq M$, where $d$ is geodesic distance
• Polynomial filters can be efficiently evaluated in vertex domain
  • Evaluate $Lx$ by message passing, then $L^2 x$, …, and lastly $L^M x$.
  • Finally sum $\sum_{m=0}^{M} p_m L^m x$. All can be done in parallel (e.g., on GPU).
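The vertex-domain evaluation can be sketched as repeated multiplication by $L$. The example assumes a 12-vertex path graph and hypothetical coefficients with $M = 2$, and checks that an impulse spreads no further than two hops.

```python
import numpy as np

N = 12
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0          # path graph
L = np.diag(A.sum(axis=1)) - A

def poly_filter(L, x, p):
    """Evaluate y = sum_m p[m] L^m x using only products with L."""
    y = p[0] * x
    Lmx = x
    for c in p[1:]:
        Lmx = L @ Lmx    # one round of neighbor-to-neighbor message passing
        y = y + c * Lmx
    return y

p = [1.0, -0.5, 0.1]     # hypothetical coefficients, degree M = 2

delta = np.zeros(N)
delta[6] = 1.0           # impulse at vertex i0 = 6
y = poly_filter(L, delta, p)

# Vertices where the impulse response is non-zero.
support = np.flatnonzero(np.abs(y) > 1e-12)
```

On a dense matrix this saves nothing, but with a sparse $L$ each product touches only graph edges, which is what makes the parallel evaluation mentioned on the slide attractive.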
![Page 38: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/38.jpg)
Prediction and Interpolation
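• Problem: Find $x_{unknown}$ such that $x = \begin{bmatrix} x_{known} \\ x_{unknown} \end{bmatrix}$ minimizes the norm $\| L^{1/2} x \|^2 = x^T L x$ of the high-pass signal $y = L^{1/2} x$ (using high-pass filter $f(\lambda) = \lambda^{1/2}$)
• Solution: $x_{unknown} = -L_{22}^{-1} L_{21} x_{known}$, where $L = \begin{bmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{bmatrix}$ is partitioned conformally with $x$
• Remark: Same as $E[X_{unknown} \mid X_{known} = x_{known}]$ if $X \sim N(0, L^{-1})$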
Previous Frame
Front Back
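A worked instance of the interpolation problem: setting the gradient of $x^T L x$ with respect to the unknown block to zero gives $L_{21} x_{known} + L_{22} x_{unknown} = 0$. The sketch assumes a 7-vertex path graph whose two end values are known, so the smoothest completion should be a straight line.

```python
import numpy as np

N = 7
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0          # path graph 0 - 1 - ... - 6
L = np.diag(A.sum(axis=1)) - A

known = np.array([0, 6])                 # vertices with known values
unknown = np.array([1, 2, 3, 4, 5])      # vertices to be filled in
x_known = np.array([0.0, 6.0])

# Blocks of L, partitioned as [known; unknown].
L22 = L[np.ix_(unknown, unknown)]
L21 = L[np.ix_(unknown, known)]

# Stationarity: L21 x_known + L22 x_unknown = 0.
x_unknown = -np.linalg.solve(L22, L21 @ x_known)
```

Solving the linear system avoids forming $L_{22}^{-1}$ explicitly; $L_{22}$ is invertible here because every unknown vertex is connected to a known one.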
![Page 39: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/39.jpg)
Prediction and Interpolation
Current Frame
Front Back
![Page 40: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/40.jpg)
Prediction and Interpolation
Warp of Previous to Current Frame
Front Back
![Page 41: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/41.jpg)
Smoothing
• Smoothing is similar, with a loose constraint:
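Find $x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$ minimizing $\| x_1 - x_{known} \|^2 + \alpha x^T L x$
• Special case:
Find $x$ minimizing $\| x - x_{known} \|^2 + \alpha x^T L x$
• Solution:
$$x^* = \left( \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} + \alpha L \right)^{-1} \begin{bmatrix} x_{known} \\ 0 \end{bmatrix}$$
(equivalent to low-pass Tikhonov filtering with $f(\lambda) = \frac{1}{1 + \alpha\lambda}$ in special case)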
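The special case can be verified directly: solving $(I + \alpha L) x = x_{known}$ should match filtering with $f(\lambda) = 1/(1 + \alpha\lambda)$ in the spectral domain. The graph, the noisy signal, and $\alpha = 2$ below are assumptions made for illustration.

```python
import numpy as np

N = 16
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0          # path graph
L = np.diag(A.sum(axis=1)) - A
lam, Psi = np.linalg.eigh(L)

rng = np.random.default_rng(1)
t = np.arange(N)
x_known = np.sin(2 * np.pi * t / N) + 0.3 * rng.standard_normal(N)

alpha = 2.0                               # hypothetical smoothing weight

# Vertex domain: minimizer of ||x - x_known||^2 + alpha x^T L x.
x_star = np.linalg.solve(np.eye(N) + alpha * L, x_known)

# Spectral domain: low-pass Tikhonov filter f(lambda) = 1 / (1 + alpha lambda).
x_filt = Psi @ ((1.0 / (1.0 + alpha * lam)) * (Psi.T @ x_known))

# Smoothness x^T L x before and after.
rough_before = x_known @ L @ x_known
rough_after = x_star @ L @ x_star
```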
![Page 42: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/42.jpg)
Graph Wavelet with Compact Support
Perfect Reconstruction Critically Sampled 2-Channel Filter Banks with Compact Support
Narang and Ortega, “Compact Support Biorthogonal Wavelet Filterbanks for Arbitrary Undirected Graphs,” TSP 2012
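$J_\beta = \mathrm{diag}(\beta_i)$, with $\beta_i = +1$ if $i \in L$ and $\beta_i = -1$ if $i \in H$
$$I = \tfrac{1}{2} G_0 (I + J_\beta) H_0 + \tfrac{1}{2} G_1 (I - J_\beta) H_1 = \tfrac{1}{2} (G_0 H_0 + G_1 H_1) + \tfrac{1}{2} (G_0 J_\beta H_0 - G_1 J_\beta H_1)$$
Design for Bipartite Graph
• $\psi_{2-\lambda} = J_\beta \psi_\lambda$ (spectral folding)
• Perfect Reconstruction if
  • $g_0(\lambda) h_0(\lambda) + g_1(\lambda) h_1(\lambda) = 2$
  • $g_0(\lambda) h_0(2 - \lambda) - g_1(\lambda) h_1(2 - \lambda) = 0$
• $g_0(\lambda) = h_1(2 - \lambda)$, $g_1(\lambda) = h_0(2 - \lambda)$
[Block diagram labels: sampling operators $\tfrac{1}{2}(I + J_\beta)$ and $\tfrac{1}{2}(I - J_\beta)$]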
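Spectral folding is easy to confirm numerically. The sketch assumes a 6-cycle as the bipartite graph (even vertices in $L$, odd vertices in $H$) and uses the normalized Laplacian, whose spectrum lies in $[0, 2]$.

```python
import numpy as np

N = 6

# Hypothetical bipartite graph: a 6-cycle.
A = np.zeros((N, N))
for v in range(N):
    A[v, (v + 1) % N] = A[(v + 1) % N, v] = 1.0

d = A.sum(axis=1)
Lap = np.eye(N) - A / np.sqrt(np.outer(d, d))   # I - D^{-1/2} A D^{-1/2}

# J_beta: +1 on the low-pass set (even), -1 on the high-pass set (odd).
J = np.diag([1.0 if v % 2 == 0 else -1.0 for v in range(N)])

lam, Psi = np.linalg.eigh(Lap)

# If psi has eigenvalue lambda, then J psi has eigenvalue 2 - lambda.
folded = all(
    np.allclose(Lap @ (J @ Psi[:, k]), (2.0 - lam[k]) * (J @ Psi[:, k]))
    for k in range(N)
)

# The two sampling operators split the identity between the L and H sets.
keep_low = 0.5 * (np.eye(N) + J)
keep_high = 0.5 * (np.eye(N) - J)
```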
![Page 43: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/43.jpg)
Multi-Resolution Processing
Iterate Low-Pass Filtering Procedure
• At level $L$, use some rule to color graph with only two colors (one color = low pass, the other = high pass)
  • Easy if graph is bi-partite
  • Otherwise separate graph into union of bi-partite graphs
• Filter to get coeffs at L/H vertices
• At level 𝐿 − 1, use some rule to reconnect low pass vertices
• Repeat
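For the bi-partite case, one possible coloring rule is a breadth-first traversal, sketched below. The adjacency-dictionary format and the convention that color 0 means low pass are assumptions; the slide does not specify a rule.

```python
from collections import deque

def two_color(adj):
    """Two-color a graph given as {vertex: [neighbors]}.

    Returns {vertex: 0 or 1} (0 = low pass, 1 = high pass, by assumption),
    or None if the graph is not bipartite.
    """
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]   # neighbors get opposite colors
                    queue.append(v)
                elif color[v] == color[u]:
                    return None               # odd cycle: not bipartite
    return color

square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

square_colors = two_color(square)       # alternating colors around the cycle
triangle_colors = two_color(triangle)   # None: needs a split into bipartite parts
```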
![Page 44: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/44.jpg)
Application to Mesh Geometry and Color Compression
Nguyen, Chou, and Chen, “Compression of Human Body Sequences Using Graph Wavelet Filter Banks,” ICASSP 2014
Anis, Chou, and Ortega, “Compression of Dynamic 3D Point Clouds using Subdivisional Meshes and Graph Wavelet Transforms,” ICASSP 2016
![Page 45: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/45.jpg)
Quad Mesh Subdivision
Coarse quad mesh
![Page 46: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/46.jpg)
Quad Mesh Subdivision
![Page 47: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/47.jpg)
Quad Mesh Subdivision
![Page 48: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/48.jpg)
Quad Mesh Subdivision
![Page 49: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/49.jpg)
Quad Mesh Subdivision
Target quad mesh
![Page 50: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/50.jpg)
Quad Mesh Subdivision
[Figure: mesh vertices labeled 0]
Lowpass subband
Highpass subband
![Page 51: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/51.jpg)
Quad Mesh Subdivision
[Figure: mesh vertices labeled 0 and 1]
Lowpass subband
Highpass subband
![Page 52: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/52.jpg)
Quad Mesh Subdivision
[Figure: mesh vertices labeled 0 to 2]
Lowpass subband
Highpass subband
![Page 53: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/53.jpg)
Quad Mesh Subdivision
[Figure: mesh vertices labeled 0 to 3]
Lowpass subband
Highpass subband
![Page 54: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/54.jpg)
Quad Mesh Subdivision
[Figure: mesh vertices labeled 0 to 4]
Lowpass subband
Highpass subband
![Page 55: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/55.jpg)
Tri Mesh Subdivision
[Figure: mesh vertices labeled 0]
Lowpass subband
Highpass subband
![Page 56: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/56.jpg)
Tri Mesh Subdivision
[Figure: mesh vertices labeled 0 and 1]
Lowpass subband
Highpass subband
![Page 57: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/57.jpg)
Tri Mesh Subdivision
[Figure: mesh vertices labeled 0 to 2]
Lowpass subband
Highpass subband
![Page 58: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/58.jpg)
Hybrid Predictive-Transform Coding
• Graph transform, uniform scalar quantization, entropy coding
• Temporal prediction removes the temporal redundancy, creates motion vectors
• Graph transform removes spatial correlation among the motion vectors
[Block diagram: transform $T$ and inverse transform $T^{-1}$]
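The transform and quantization steps can be sketched as follows. Everything concrete here is an assumption: a 10-vertex path graph stands in for the mesh, $T$ is taken to be the Graph Fourier Transform of its Laplacian, the motion field is synthetic, and entropy coding is left out.

```python
import numpy as np

N = 10
A = np.zeros((N, N))
i = np.arange(N - 1)
A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A
lam, Psi = np.linalg.eigh(L)
T, T_inv = Psi.T, Psi                     # orthonormal graph transform

rng = np.random.default_rng(2)
prev = rng.standard_normal((N, 3))        # previous-frame vertex positions
# Smooth synthetic motion vectors, one 3D vector per vertex.
motion = np.linspace(0.0, 1.0, N)[:, None] * np.array([0.2, 0.0, -0.1])
curr = prev + motion

step = 0.01                                # uniform quantizer step size

# Encoder: temporal prediction, transform, uniform scalar quantization.
symbols = np.round(T @ (curr - prev) / step).astype(int)

# Decoder: dequantize, inverse transform, add the prediction back.
motion_hat = T_inv @ (symbols * step)
curr_hat = prev + motion_hat

# Because T is orthonormal, the error is bounded by the quantizer alone.
err = np.linalg.norm(curr_hat - curr)
bound = 0.5 * step * np.sqrt(motion.size)
```

The integer array `symbols` is what an entropy coder would receive; for a smooth motion field most high-frequency entries are zero.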
![Page 59: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/59.jpg)
Context-Adaptive Arithmetic Coding
• Let 𝑛 = 0,1,2,3,4 be the subband index
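• If coefficient at node $i$ is in subband $\mathcal{V}_n$, model it as Laplacian with parameter
$$\lambda_i = \left( \beta_{n,0} + \sum_{j=1}^{n-1} \beta_{n,j} \frac{\sum_{k \in \mathcal{N}_i \cap \mathcal{V}_{n-j}} |x_k|}{|\mathcal{N}_i \cap \mathcal{V}_{n-j}|} \right)^{-1}$$
depending on average of magnitudes $|x_k|$ of neighbors $k \in \mathcal{N}_i \cap \mathcal{V}_{n-j}$ in previous subbands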
• Fit constants {𝛽𝑛,𝑗} to data
[Figure: mesh vertices labeled with subband index 0 to 4]
![Page 60: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/60.jpg)
Rate-distortion Performance
[Plot panels: Geometry, Color]
![Page 61: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/61.jpg)
Application to Point Cloud Geometry and Color Compression
Thanou, Chou, and Frossard, “Graph-based compression of dynamic 3D point cloud sequences,” TIP 2015
Queiroz and Chou, “Compression of 3D Point Clouds Using a Region-Adaptive Hierarchical Transform,” TIP 2016
Queiroz and Chou, “Motion-Compensated Compression of Dynamic Voxelized Point Clouds,” TIP 2016
![Page 62: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/62.jpg)
![Page 63: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/63.jpg)
Octree Coding for (Static) Geometry
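[Figure: octree whose root node has occupancy byte 10010001 and whose three occupied children have occupancy bytes 10010001, 11001001, 10010001]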
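One way to produce such occupancy bytes is a breadth-first traversal of the octree. The sketch below assumes integer voxel coordinates in a cube of side 2^depth and an arbitrary child-to-bit ordering (child 0 in the most significant bit); the slide does not state its ordering.

```python
def octree_bytes(voxels, depth):
    """Breadth-first occupancy codes for voxels in [0, 2**depth)^3.

    Each internal node emits one 8-bit string; bit c is set when child
    octant c contains at least one occupied voxel.
    """
    occupied = set(voxels)
    codes = []
    level = [(0, 0, 0)]                       # origins of nodes at this level
    for d in range(depth):
        size = 2 ** (depth - d - 1)           # edge length of a child cube
        next_level = []
        for ox, oy, oz in level:
            byte = 0
            for child in range(8):
                cx = ox + size * ((child >> 2) & 1)
                cy = oy + size * ((child >> 1) & 1)
                cz = oz + size * (child & 1)
                if any(cx <= x < cx + size and
                       cy <= y < cy + size and
                       cz <= z < cz + size for x, y, z in occupied):
                    byte |= 1 << (7 - child)  # child 0 maps to the top bit
                    next_level.append((cx, cy, cz))
            codes.append(format(byte, "08b"))
        level = next_level                    # only occupied children recurse
    return codes

# Two voxels at opposite corners of a 4 x 4 x 4 grid.
codes = octree_bytes({(0, 0, 0), (3, 3, 3)}, depth=2)
```

Only occupied children are subdivided further, so the number of bytes grows with the number of occupied voxels rather than with the volume.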
![Page 64: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/64.jpg)
Octree Coding for Color?
221,136,255
255,153,255 255,102,255 153,153,255
![Page 65: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/65.jpg)
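Haar Butterfly
[Figure: butterfly diagram. Transform: the inputs $g_{3,0}, \dots, g_{3,7}$ are combined pairwise with gains $\pm 1/\sqrt{2}$ into $(g_{2,k}, h_{2,k})$ for $k = 0, \dots, 3$, then $(g_{1,k}, h_{1,k})$ for $k = 0, 1$, then $(g_{0,0}, h_{0,0})$. Quantization, Entropy Coding, and Transmission. Inverse transform: $(\hat{g}_{0,0}, h_{0,0})$ is expanded back through $\hat{g}_{1,k}$ and $\hat{g}_{2,k}$ to $\hat{g}_{3,0}, \dots, \hat{g}_{3,7}$.]
Transform:
$$\begin{bmatrix} g_{l,k} \\ h_{l,k} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} g_{l+1,2k} \\ g_{l+1,2k+1} \end{bmatrix}$$
Inverse Transform:
$$\begin{bmatrix} \hat{g}_{l+1,2k} \\ \hat{g}_{l+1,2k+1} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \hat{g}_{l,k} \\ h_{l,k} \end{bmatrix}$$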
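The butterfly on this slide can be sketched in a few lines of Python. This is an illustrative version only: it assumes the input length is a power of two, omits the quantization step between the two halves, and uses function names that are not from the talk.

```python
import math

S = 1.0 / math.sqrt(2.0)  # the butterfly gain 1/sqrt(2)

def haar_forward(g):
    """Return (g_00, levels), where levels[l] lists the highpass values h_{l,k}."""
    levels = []
    while len(g) > 1:
        half = len(g) // 2
        low = [S * (g[2 * k] + g[2 * k + 1]) for k in range(half)]
        high = [S * (g[2 * k + 1] - g[2 * k]) for k in range(half)]
        levels.append(high)
        g = low
    levels.reverse()  # levels[0] is now the coarsest level, h_{0,0}
    return g[0], levels

def haar_inverse(dc, levels):
    """Rebuild the finest-level samples from g_00 and the highpass values."""
    g = [dc]
    for high in levels:
        finer = []
        for gk, hk in zip(g, high):
            finer.append(S * (gk - hk))  # g_{l+1,2k}
            finer.append(S * (gk + hk))  # g_{l+1,2k+1}
        g = finer
    return g

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dc, levels = haar_forward(x)
rec = haar_inverse(dc, levels)
```

Because each 2x2 block is orthonormal, the transform preserves energy and, without quantization, the inverse returns the input up to floating-point rounding.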
![Page 66: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/66.jpg)
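Haar Tree
[Figure: the same transform drawn as a binary tree. Transform: leaves $g_{3,0}, \dots, g_{3,7}$; level 2 nodes $(g_{2,k}, h_{2,k})$ for $k = 0, \dots, 3$; level 1 nodes $(g_{1,k}, h_{1,k})$ for $k = 0, 1$; root $(g_{0,0}, h_{0,0})$. Quantization, Entropy Coding, and Transmission. Inverse transform: root $\hat{g}_{0,0}$ with $h_{0,0}$; level 1 nodes $(\hat{g}_{1,k}, h_{1,k})$; level 2 nodes $(\hat{g}_{2,k}, h_{2,k})$; leaves $\hat{g}_{3,0}, \dots, \hat{g}_{3,7}$. Legend: a node $(g_{l,k}, h_{l,k})$ is computed from its children $(g_{l+1,2k}, g_{l+1,2k+1})$ with gains $\pm 1/\sqrt{2}$.]
Transform:
$$\begin{bmatrix} g_{l,k} \\ h_{l,k} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} g_{l+1,2k} \\ g_{l+1,2k+1} \end{bmatrix}$$
Inverse Transform:
$$\begin{bmatrix} \hat{g}_{l+1,2k} \\ \hat{g}_{l+1,2k+1} \end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} \hat{g}_{l,k} \\ h_{l,k} \end{bmatrix}$$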
![Page 67: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/67.jpg)
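Region Adaptive Haar Transform (RAHT)
[Figure: RAHT tree over the occupied leaves $g_{3,0}, g_{3,1}, g_{3,3}, g_{3,6}$. Transform: level 2 holds $g_{2,0}, h_{2,0}, g_{2,1}, g_{2,3}$; level 1 holds $g_{1,0}, h_{1,0}, g_{1,1}$; level 0 holds $g_{0,0}, h_{0,0}$. Quantization, Entropy Coding, and Transmission. Inverse transform: $\hat{g}_{0,0}$ and $h_{0,0}$ are expanded back through $(\hat{g}_{1,0}, h_{1,0}), \hat{g}_{1,1}$ and $(\hat{g}_{2,0}, h_{2,0}), \hat{g}_{2,1}, \hat{g}_{2,3}$ to $\hat{g}_{3,0}, \hat{g}_{3,1}, \hat{g}_{3,3}, \hat{g}_{3,6}$.]
Weights: $w_{3,0} = w_{3,1} = w_{3,3} = w_{3,6} = 1$; $w_{2,0} = 2$, $w_{2,1} = 1$, $w_{2,3} = 1$; $w_{1,0} = 3$, $w_{1,1} = 1$; $w_{0,0} = 4$
Transform:
$$\begin{bmatrix} g_{l,k} \\ h_{l,k} \end{bmatrix} = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \begin{bmatrix} g_{l+1,2k} \\ g_{l+1,2k+1} \end{bmatrix}$$
Inverse Transform:
$$\begin{bmatrix} \hat{g}_{l+1,2k} \\ \hat{g}_{l+1,2k+1} \end{bmatrix} = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \begin{bmatrix} \hat{g}_{l,k} \\ h_{l,k} \end{bmatrix}$$
where
$$a = \frac{\sqrt{w_{l+1,2k}}}{\sqrt{w_{l+1,2k} + w_{l+1,2k+1}}}, \qquad b = \frac{\sqrt{w_{l+1,2k+1}}}{\sqrt{w_{l+1,2k} + w_{l+1,2k+1}}}, \qquad a^2 + b^2 = 1$$
Scaled DC:
$$\sqrt{w_{l+1,2k} + w_{l+1,2k+1}} \; g_{l,k} = \sqrt{w_{l+1,2k}} \; g_{l+1,2k} + \sqrt{w_{l+1,2k+1}} \; g_{l+1,2k+1}$$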
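The weighted butterfly can be tried on the slide's own example, with occupied leaves at positions 0, 1, 3, and 6. The sketch below is a one-dimensional illustration with hypothetical function names. It assumes every leaf has weight 1, that a node with a single occupied child passes its value and weight up unchanged, and that the decoder already knows the occupancy (and therefore the weights) from the geometry.

```python
import math

def raht_forward(leaves, depth):
    """leaves: {k: value} at level `depth`, weight 1 each.
    Returns (g_00, highpass {(l, k): h}, weights {(l, k): w})."""
    g = dict(leaves)
    w = {k: 1 for k in leaves}
    weights = {(depth, k): 1 for k in leaves}
    highpass = {}
    for l in range(depth - 1, -1, -1):
        g_up, w_up = {}, {}
        for k in sorted({c // 2 for c in g}):
            left, right = 2 * k, 2 * k + 1
            if left in g and right in g:
                w0, w1 = w[left], w[right]
                a = math.sqrt(w0 / (w0 + w1))
                b = math.sqrt(w1 / (w0 + w1))
                g_up[k] = a * g[left] + b * g[right]
                highpass[(l, k)] = -b * g[left] + a * g[right]
                w_up[k] = w0 + w1
            else:  # single occupied child: pass value and weight through
                only = left if left in g else right
                g_up[k], w_up[k] = g[only], w[only]
            weights[(l, k)] = w_up[k]
        g, w = g_up, w_up
    return g[0], highpass, weights

def raht_inverse(dc, highpass, weights, depth):
    """Rebuild the leaf values; `weights` stands in for the known geometry."""
    g = {0: dc}
    for l in range(depth):
        g_down = {}
        for k, gk in g.items():
            left, right = (l + 1, 2 * k), (l + 1, 2 * k + 1)
            if left in weights and right in weights:
                w0, w1 = weights[left], weights[right]
                a = math.sqrt(w0 / (w0 + w1))
                b = math.sqrt(w1 / (w0 + w1))
                h = highpass[(l, k)]
                g_down[2 * k] = a * gk - b * h
                g_down[2 * k + 1] = b * gk + a * h
            else:
                only = left if left in weights else right
                g_down[only[1]] = gk
        g = g_down
    return g

leaves = {0: 10.0, 1: 12.0, 3: 20.0, 6: 30.0}
dc, highpass, weights = raht_forward(leaves, depth=3)
rec = raht_inverse(dc, highpass, weights, depth=3)
```

For this input the only highpass coefficients are $h_{2,0}$, $h_{1,0}$, and $h_{0,0}$, matching the slide, and the scaled-DC relation gives $\sqrt{4}\, g_{0,0}$ equal to the sum of the four leaf values.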
![Page 68: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/68.jpg)
Tree specifies bi-partite graph
• Tree is a rule
  • At level 𝐿, how to connect and color nodes to form a bi-partite graph
  • At level 𝐿 − 1, how to reconnect low-pass nodes into a bi-partite graph
• Could be the way to generalize beyond 2-tap Haar using GSP
[Figure: RAHT tree with leaves $g_{3,0}, g_{3,1}, g_{3,3}, g_{3,6}$; level 2 nodes $g_{2,0}, h_{2,0}, g_{2,1}, g_{2,3}$; level 1 nodes $g_{1,0}, h_{1,0}, g_{1,1}$; root $g_{0,0}, h_{0,0}$. Nodes are colored as lowpass subband or highpass subband.]
![Page 69: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/69.jpg)
Feedforward Adaptive Arithmetic Coding
• Transformed coefficients with the same weight are grouped into the same subband
• Typically we have about 1000 subbands
• Each subband is encoded using arithmetic coding
  • Probabilities follow a Laplacian distribution
• Both encoder and decoder have to agree on the standard deviation 𝜆
• For each subband 𝑛, we send 𝜆𝑛
  • 𝜆𝑛 are optimally quantized and encoded using Run-Length Golomb-Rice
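To make the per-subband model concrete, the sketch below groups quantized coefficients by weight, fits one Laplacian per subband, and sums the ideal code length $-\log_2 p$ that an arithmetic coder approaches. It is an illustration under stated assumptions, not the coder from the talk: the parameter is the unquantized sample standard deviation, the bins are uniform with a hypothetical `step`, and the arithmetic coder and the Run-Length Golomb-Rice coding of the parameters are omitted.

```python
import math
from collections import defaultdict

def symbol_probability(q, sigma, step=1.0):
    """Probability that a zero-mean Laplacian sample (std sigma) lands in bin q."""
    b = sigma / math.sqrt(2.0)  # Laplacian scale parameter for standard deviation sigma
    if q == 0:
        return 1.0 - math.exp(-0.5 * step / b)
    lo = (abs(q) - 0.5) * step
    hi = (abs(q) + 0.5) * step
    return 0.5 * (math.exp(-lo / b) - math.exp(-hi / b))

def ideal_bits(coeffs, step=1.0):
    """coeffs: iterable of (weight, quantized index) pairs.
    Returns ({weight: sigma}, total ideal code length in bits)."""
    subbands = defaultdict(list)
    for weight, q in coeffs:
        subbands[weight].append(q)  # same weight -> same subband
    sigmas, bits = {}, 0.0
    for weight, qs in subbands.items():
        # RMS of the dequantized values, floored so an all-zero subband stays valid.
        rms = math.sqrt(sum((q * step) ** 2 for q in qs) / len(qs))
        sigma = max(rms, 1e-3 * step)
        sigmas[weight] = sigma  # the side information sent per subband
        bits += sum(-math.log2(symbol_probability(q, sigma, step)) for q in qs)
    return sigmas, bits

coeffs = [(1, 4), (1, -3), (2, 1), (2, 0), (2, -1), (4, 0)]
sigmas, bits = ideal_bits(coeffs)
print(sigmas, round(bits, 2))
```

The scheme is feedforward in that the encoder measures each subband and transmits its parameter ahead of the data, so the decoder can build the same probability tables before decoding that subband.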
![Page 70: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/70.jpg)
RAHT Results (1)
Santa Octopus
Comparison to Huang, Peng, Kuo, and Gopi, “A generic scheme for progressive point cloud coding,” IEEE Trans. Vis. Comput. Graph., 2008
![Page 71: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/71.jpg)
Results RAHT (2)
Comparison to Zhang, Florencio, and Loop, “Point cloud attribute compression with graph transform,” ICIP 2014
![Page 72: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/72.jpg)
Results RAHT (3)
Andrew Phil
Ricardo Sarah
Comparison to Zhang, Florencio, and Loop, “Point cloud attribute compression with graph transform,” ICIP 2014
![Page 73: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/73.jpg)
Compression of Dynamic Point Clouds
Dorina Thanou, Philip A. Chou, and Pascal Frossard, Graph-Based Motion Estimation and Compensation for Dynamic 3D Point Cloud Compression, IEEE Trans. Image Processing, Apr. 2016
Ricardo L. de Queiroz and Philip A. Chou, Motion-Compensated Compression of Dynamic Voxelized Point Clouds, IEEE Trans. Image Processing, submitted 2016
RAHT, 2.6 bpv (intra coded) vs. MCIC, 2.6 bpv (block motion comp)
Sparse motion estimation; dense (interpolated) motion compensation
![Page 74: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/74.jpg)
Rough sense of overall bitrates
• 2 Gbps: barely coded (holoportation demo) – 1 Mpixel color
• 450 Mbps: Intra-frame only, octree for geometry, uncoded 8:1:1 YUV
• 150 Mbps: Intra-frame only, octree for geometry, RAHT for YUV
• 45 Mbps: Inter-frame, hybrid (real time)
• 15 Mbps: poorly-coded mesh + H.264-coded texture (SIGGRAPH paper, not real time)
• 8 Mbps: well-coded mesh + H.265-coded texture
• 6 Mbps: well-coded mesh + periodic texture refresh
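To put the figures above in proportion, the reduction relative to the 2 Gbps starting point can be computed directly. This is a small sketch using the slide's round numbers, taking 2 Gbps as 2000 Mbps; the short labels are paraphrases.

```python
BASELINE_MBPS = 2000.0  # 2 Gbps, the barely coded holoportation demo

rates_mbps = [
    ("intra, octree geometry, uncoded YUV", 450.0),
    ("intra, octree geometry, RAHT for YUV", 150.0),
    ("inter-frame hybrid (real time)", 45.0),
    ("poorly-coded mesh + H.264 texture", 15.0),
    ("well-coded mesh + H.265 texture", 8.0),
    ("well-coded mesh + periodic texture refresh", 6.0),
]

# Reduction factor of each operating point relative to the baseline.
ratios = {label: BASELINE_MBPS / rate for label, rate in rates_mbps}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f}x smaller than the baseline")
```

The list spans a bit more than two orders of magnitude, from roughly 4x for intra-only coding with uncoded color to roughly 333x for a well-coded mesh with periodic texture refresh.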
![Page 75: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/75.jpg)
Future Issues
• Better motion estimation and compensation for point clouds
• Perceptually relevant distortion measures
• Hybrid mesh / point cloud coding – best of both
• Hybrid AR/VR coding – outside-in + inside-out
• Scalable coding
  • Spatial as well as temporal random access
  • Object scalability, spatial resolution scalability, signal resolution (SNR) scalability
• Rate-distortion optimization
• Error resilience
• Rate control, buffer management
• Streaming architecture
  • Low-end vs high-end devices and rendering location
![Page 76: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/76.jpg)
Conclusion
• VR/AR are 3G immersive comm, 4G platform, to be ~$100 billion / year
• VR is closer in deployment and technology (similar to video coding); AR is later but may subsume VR (will need new coding tools)
• GSP provides an extension of classic tools necessary to process AR signals
• AR coding analogous to video coding circa 1980: coding paradigm not commonly agreed upon. History of video coding gives us a roadmap for development of AR coding.
![Page 77: Coding for Virtual and Augmented Reality - struga.orgstruga.org/Jakov/Misc/PacketVideo2016KeynotePhilChou.pdfCoding for Virtual and Augmented Reality Philip A. Chou, Microsoft Research](https://reader031.vdocuments.mx/reader031/viewer/2022021820/5adfb1967f8b9afd1a8d0c48/html5/thumbnails/77.jpg)
Thanks
Special thanks go to my interns Ha Nguyen, Dorina Thanou, Amir Anis, and Eduardo Pavel Carvelli; visiting researcher Ricardo de Queiroz; colleagues Dinei Florencio, Cha Zhang, Charles Loop, Qin Cai, and the I3D group; and university collaborators Pascal Frossard, Antonio Ortega, Gene Cheung, and Minh Do.