Post on 21-Dec-2015
High Dynamic Range
Emeka Ezekwe M11
Christopher Thayer M12
Shabnam Aggarwal M13
Charles Fan M14
Manager: Matthew Russo
04/18/23
Agenda
Project Description: Charles
Marketing: Shabnam
Behavioral Description: Emeka
Design Process: Chris
Floorplan Evolution: Shabnam
Design Specifications: Chris
Layout: Charles
Conclusion: Emeka
Project Description
High Dynamic Range?
  Bright colors are BRIGHT
  Dark colors are DARK
  Details are seen CLEARLY
Otherwise… Colors and lights look distorted and bland
FP HDR format requires 48 bits per pixel
  Problem: Too much storage space and memory bandwidth!
  Solution: HDR encoding yields 6:1 compression
OUR GOAL: Implement efficient HDR decoding in hardware
6:1 pixel compression
  Increases usable storage space 6-fold
  Decreases memory bandwidth 6-fold
  Effectively increases performance
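The 6:1 figure follows directly from the two bit rates (48 bits per pixel uncompressed, 8 bits per pixel encoded). A minimal sketch of the storage arithmetic, assuming a hypothetical 1024x1024 texture that does not appear in the slides:

```python
def texture_bytes(width, height, bits_per_pixel):
    """Storage needed for a width x height texture at the given bit rate."""
    return width * height * bits_per_pixel // 8

# Hypothetical texture size, chosen only for illustration.
WIDTH, HEIGHT = 1024, 1024

raw_bytes = texture_bytes(WIDTH, HEIGHT, 48)      # FP HDR format: 48 bits per pixel
encoded_bytes = texture_bytes(WIDTH, HEIGHT, 8)   # encoded format: 8 bits per pixel
ratio = raw_bytes / encoded_bytes                 # compression ratio

print(raw_bytes, encoded_bytes, ratio)
```

The same ratio applies to the bytes fetched per pixel, which is where the memory bandwidth saving comes from.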
Marketing
AMD’s ATI Mobility Radeon X1900
  48-bit floating point HDR
  HDR compression is currently NOT supported
  Performance hit deters developers
Windows Vista also now requires a high-end GPU to realize its full graphics potential.
Laptops and portable devices are using dedicated processors for graphics
OLED (Organic Light Emitting Diode) displays are being developed by Sony
  Contrast ratio: 1,000,000:1
Marketing
Our decoder is designed to interface between specially encoded textures stored on the GPU’s memory and one of the GPU’s texture caches that feed into the shader processor.
Each ROP on (**ATI) is capable of processing 4 pixels per clock cycle. We plan for our hardware to decode the texture information for 4 pixels during each clock cycle.
This decoder will allow smaller textures to be stored in the GPU’s memory, which will allow graphics cards to provide the same functions with less memory.
Ultimately, this decoder can provide savings in cost, power consumption, heat dissipation, and size in current graphics cards.
Our HDR Decoder!!
Marketing
Our HDR Decoder:
  Smaller textures stored in GPU’s memory
  Same functions…less memory
Savings in:
  Cost
  Power consumption
  Heat dissipation
  Size
HDR is the next generation of display technology
Algorithmic Description
Encoding
  Break texture into 4x4 pixel blocks.
  Extract luminance value of each pixel.
  Normalize red and blue values and average over each 2x2 block.
  Green can be recalculated while decoding.
  Allocate more bits to luminance values.
  After encoding, a 4x4 block of pixels can be compressed from 48 bpp to 8 bpp.
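The encoding steps can be sketched in software as follows. This is an illustration under stated assumptions, not the team's encoder: the luminance weights are hypothetical (the slides do not give them), "normalize" is taken to mean dividing the weighted channel by the pixel's luminance, and the quantization into the final 8 bpp bit layout is omitted because the slides do not specify it.

```python
# Hypothetical luminance weights (assumption; not stated in the slides).
W_R, W_G, W_B = 0.299, 0.587, 0.114

def luminance(r, g, b):
    """Weighted sum of the three channels."""
    return W_R * r + W_G * g + W_B * b

def encode_block(block):
    """block: 4x4 nested list of (r, g, b) floats.

    Returns (lum, chroma): a 4x4 grid of per-pixel luminance values and a
    2x2 grid of (red, blue) normalized values averaged over each 2x2 sub-block.
    Green is not stored.
    """
    assert len(block) == 4 and all(len(row) == 4 for row in block)
    lum = [[luminance(*px) for px in row] for row in block]
    chroma = []
    for by in range(0, 4, 2):
        chroma_row = []
        for bx in range(0, 4, 2):
            r_sum = b_sum = 0.0
            for y in range(by, by + 2):
                for x in range(bx, bx + 2):
                    r, _, b = block[y][x]
                    y_lum = lum[y][x]
                    if y_lum > 0.0:  # black pixels contribute no chroma
                        r_sum += W_R * r / y_lum
                        b_sum += W_B * b / y_lum
            chroma_row.append((r_sum / 4.0, b_sum / 4.0))
        chroma.append(chroma_row)
    return lum, chroma

# Usage: a uniform 4x4 block of one HDR color.
block = [[(2.0, 4.0, 1.0)] * 4 for _ in range(4)]
lum, chroma = encode_block(block)
```

Under this normalization the red, green, and blue shares of a pixel sum to 1, which is why green does not need to be stored. Luminance is kept at full 4x4 resolution while chroma is kept at 2x2, in line with the slide's point about allocating more bits to luminance.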
Algorithmic Description
Decoding
  Reconstruct Lp (luminance values)
    1 logical shift
    1 integer addition
  Calculate G
    1 integer addition
  Calculate final pixel values
    3 floating-point multiplications
  Total calculations
    1 logical shift + 2 integer additions + 3 floating-point multiplications
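The decoding steps can be sketched in software as below. Everything beyond the operation list on the slide is an assumption: the names `base`, `delta`, and `shift` are hypothetical, the luminance weights are hypothetical, and red and blue are assumed to be stored as each channel's weighted share of the luminance, so that the three shares sum to 1. The sketch does not try to match the hardware's exact operation count or its integer formats.

```python
# Hypothetical luminance weights (assumption; not stated in the slides).
W_R, W_G, W_B = 0.299, 0.587, 0.114

def reconstruct_lp(base, delta, shift):
    """One logical shift plus one integer addition (field names are hypothetical)."""
    return base + (delta << shift)

def decode_pixel(lum, r_norm, b_norm):
    """Recover (r, g, b) from luminance and the stored red/blue shares.

    Green's share is whatever remains after red and blue, so it is
    recomputed rather than read from memory.
    """
    g_norm = 1.0 - r_norm - b_norm
    # One multiplication by luminance per channel; the 1/W scale factors
    # come from the assumed normalization, not from the slides.
    return (lum * (r_norm / W_R),
            lum * (g_norm / W_G),
            lum * (b_norm / W_B))

# Usage: encode one pixel by hand, then decode it.
r, g, b = 2.0, 4.0, 1.0
lum = W_R * r + W_G * g + W_B * b
decoded = decode_pixel(lum, W_R * r / lum, W_B * b / lum)
```

In this toy round trip the decoded values match the original pixel up to floating-point rounding, because nothing has been quantized or averaged.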
Data Flow
[Figure: data-flow diagram. Recoverable block labels: Find G; Compute 1 pixel (x4); Int to FP; Serialize output (x4); pipeline registers (Reg, Reg16); numeric labels 7, 7, 4, 4, 4, 4, 8.]
Design Process
Goal: Speed
  400 MHz
  4 pixels per cycle, 4 cycles per block
Architectural decisions
  No denormal support in floating-point multiplier
  Pipelined design
  Storing input values
  Integer multiplication
    Wallace trees
    Booth encoding
  Critical adders
    Carry select
  Integer-to-floating-point conversion
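Booth encoding reduces the number of partial products an integer multiplier has to sum, which in turn shrinks the Wallace tree. The slides do not say which radix was used, so the following is only a software sketch of the common radix-4 variant; the function names are hypothetical.

```python
def booth_radix4_digits(x, bits):
    """Recode a two's-complement integer into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least significant first, such that
    x == sum(d * 4**k for k, d in enumerate(digits)).
    x must fit in `bits` bits of two's complement.
    """
    if bits % 2:
        bits += 1                      # sign-extend to an even width
    u = x & ((1 << bits) - 1)          # two's-complement bit pattern
    digits = []
    prev = 0                           # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(a, b, bits):
    """Multiply a by b using one partial product per Booth digit of b."""
    partial_products = [(d * a) << (2 * k)
                        for k, d in enumerate(booth_radix4_digits(b, bits))]
    # In hardware these partial products would feed the Wallace tree.
    return sum(partial_products)

print(booth_radix4_digits(7, 4))   # 7 = -1 + 2*4
print(booth_multiply(13, -6, 8))
```

With radix-4 recoding an n-bit multiplier produces about n/2 partial products instead of n, and each one is a shift, a negation, or zero, so no extra multipliers are needed to form them.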
Circuit level decisions
  Mirror FA’s to reduce carry-chain delay
  Two different HA’s
  AOI/OAI gates
  Gate sizing along critical paths
  Utilize Q and ~Q outputs from registers
  Clock buffers built into register blocks
  Double/Triple strapped VDD and GND
  Repeaters to break up long wires
  Balanced clock tree
  Device folding
Design Process
Verification Process
C Implementation
Structural Verilog
Gate Level Schematic
Layout
Major Modules
Pipeline Stages
Global Signals
Design Specifications
Delays
  Stage one pipeline: 1.8 ns
  Stage two pipeline: 1.53 ns
  Stage three pipeline: 2.479 ns
Skew
  Stage one: x
  Stage two: x
  Stage three: x
Resulting Clock Speed: 500 MHz
  2 BILLION pixels per second
Size: 442x453 microns
  Aspect Ratio: 1:1.024
Transistors: 42,772
  Density: 0.21 T/micron^2
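The derived figures on this slide can be checked from the raw numbers. The sketch below assumes the 4 pixels per cycle stated in the design goal and uses only values from the slides; the variable names are illustrative.

```python
STAGE_DELAYS_NS = [1.8, 1.53, 2.479]   # pipeline stage delays from the slide
CLOCK_HZ = 500e6                       # reported clock speed
PIXELS_PER_CYCLE = 4                   # from the design goal
WIDTH_UM, HEIGHT_UM = 442, 453         # layout size in microns
TRANSISTORS = 42_772

pixels_per_second = CLOCK_HZ * PIXELS_PER_CYCLE
area_um2 = WIDTH_UM * HEIGHT_UM
density = TRANSISTORS / area_um2            # transistors per square micron
aspect = HEIGHT_UM / WIDTH_UM               # long side over short side

# Clock bound implied by the slowest listed stage alone (ignores skew and setup).
slowest_stage_limit_mhz = 1e3 / max(STAGE_DELAYS_NS)

print(f"{pixels_per_second:.2e} pixels/s")
print(f"{density:.2f} T/micron^2, aspect 1:{aspect:.3f}")
print(f"{slowest_stage_limit_mhz:.0f} MHz bound from slowest stage")
```

The pixel rate and density reproduce the slide's 2 billion pixels per second and 0.21 T/micron^2, and the aspect ratio comes out near 1:1.025. Taken alone, the 2.479 ns stage would bound the clock at roughly 403 MHz, close to the 400 MHz design goal, so the 500 MHz figure presumably rests on numbers not shown in this transcript.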