Computer Graphics
Graphics Hardware
CO2409 Computer Graphics
Week 12
Lecture Contents
1. Graphics Architecture
2. Processing: GPU vs CPU
3. Memory: Video Memory vs System Memory
4. Hardware Comparisons
5. Motherboard Interface
6. Parallelism / Concurrency
Graphics Architecture
• The basic graphics architecture for all modern PCs and game consoles is similar
• Two key parts:
  – Main System
  – Graphics Unit
• Local RAM for each processor
  – Fast access
• Interface between components is slow
  – Compared to local access
Graphics Unit
• The graphics unit describes the graphics processing components of a system
  – PC: usually a separate graphics card
  – Console: built-in custom components
• The core of a graphics unit is the Graphics Processing Unit (GPU)
  – Occasionally called a Graphics Adapter
  – Equivalent of a CPU for graphics
[Image: PS4 GPU]
Graphics Unit
• The graphics unit also contains:
  – Local RAM (a.k.a. video memory, VRAM, etc.)
  – Output sockets for monitor, TV etc.
• Can be integrated into the motherboard
  – Consoles, some PCs and laptops
• Attached via an interface:
  – PCs use PCI Express sockets
  – Can connect several together (SLI or Crossfire) for more parallelism
[Image: NVidia GeForce GTX 980 graphics cards, four connected by SLI]
GPU vs CPU
• A GPU is a dedicated graphics processor
• Much more parallel than a typical CPU
  – i.e. many more cores (100s or 1000s compared to 4-8 on a CPU)
• Much faster than a CPU for graphics algorithms
  – Particularly vector/matrix operations
  – But worse at general operations, esp. conditional instructions
• Driven by a programmable pipeline (shaders)
  – Can now perform some general programming using the flexibility of this
  – GPGPU programming (General Purpose GPU)
• Power of a GPU is measured by its:
  – Clock speed
  – Parallelism, e.g. number of simultaneous shader, texture and output operations
Graphics Memory
• Many graphics units have local memory
  – To store vertices, primitives and textures for the GPU
• This memory is similar to standard CPU memory
  – Usually more closely coupled to the GPU
  – I.e. bandwidth to the GPU will be greater
• Graphics memory can be measured by:
  – Clock speed (how fast it can serve data)
  – Bandwidth to the GPU (clock speed * width of the bus between graphics memory and the GPU)
• Some GPUs have no dedicated memory
  – Must share memory with the CPU
  – This can be a benefit or a penalty…
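The bandwidth rule above (clock speed * bus width) can be sketched as a small calculation. This is a minimal illustration, not a vendor formula: the function name is hypothetical, the clock is taken to be the effective data rate in MHz, and the 256-bit bus width in the example is an assumed figure.

```python
def memory_bandwidth_gb_per_s(effective_clock_mhz, bus_width_bits):
    """Theoretical peak bandwidth: transfers per second * bytes per transfer."""
    transfers_per_second = effective_clock_mhz * 1e6
    bytes_per_transfer = bus_width_bits / 8  # the bus moves this many bytes at once
    return transfers_per_second * bytes_per_transfer / 1e9


# Assumed example: 7000 MHz effective memory clock on a 256-bit bus.
example_bandwidth = memory_bandwidth_gb_per_s(7000, 256)
print(f"{example_bandwidth:.1f} GB/s")  # 224.0 GB/s
```

Doubling either the clock or the bus width doubles the theoretical figure, which is why both appear in a graphics card's specification.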
Graphics vs System Memory
• GPUs can usually access system memory
  – i.e. CPU-local memory
• Bandwidth is often lower
  – Usually better to have graphics data in GPU memory
• System memory can be used as a back-up
  – But expect lag if relied on too heavily
• Some architectures share CPU & GPU memory
  – Xbox 360 / One / PS4: entire console architecture designed around this sharing, so performance is high
  – On-board video cards: lack of GPU memory is a cost-cutting feature, so memory access is slow
Graphics Unit - Specifications
• The key specifications for a graphics unit are:
  – GPU clock speed (MHz): speed of the processor
  – Amount of memory (MB)
  – Memory clock speed (MHz / GHz)
  – GPU <-> memory bandwidth (GB/s)
    • Speed of transfers between GPU and local memory, determined by memory clock speed and bus width
• Other factors to take into account:
  – Pixel / texture fill rate (gigapixels/s)
    • Speed at which it can output pixels to the screen or to textures
  – Amount of parallelism in the pipeline
    • E.g. number of shader threads/cores: how many vertices / pixels can be processed at the same time
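A theoretical fill rate is commonly estimated as the number of output (or texture) units multiplied by the GPU clock. The sketch below assumes that simple model; the function name and the unit counts (32 output units, 72 texture units at 800 MHz) are illustrative assumptions, not figures stated in these slides.

```python
def fill_rate_giga_per_s(units, clock_mhz):
    """Theoretical fill rate: each unit is assumed to emit one result per clock."""
    return units * clock_mhz * 1e6 / 1e9


# Assumed example GPU: 32 pixel output units and 72 texture units at 800 MHz.
pixel_rate = fill_rate_giga_per_s(32, 800)
texture_rate = fill_rate_giga_per_s(72, 800)
print(f"pixel: {pixel_rate:.1f} G/s, texture: {texture_rate:.1f} G/s")
```

Real fill rates are usually lower than this peak, since memory bandwidth and features such as anti-aliasing also limit output.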
Historical Comparison (Indicative)
Platform        | CPU / GPU Clock (MHz) | CPU / GPU Memory (MB) | Memory Clock (MHz) | Memory Bus (GB/s) | Pixel / Texture Fill Rate (GPx/s) | Parallelism
PS2             | 294 / 147             | 32 / 4                | ?                  | 3.2 / 48          | ?                                 | -
Xbox            | 733 / 233             | 64 (Shared)           | 200                | 6.4               | ? / 0.932                         | 4 / 2 Shader
Xbox 360        | 3 x 3200 / 500        | 512 (Shared)          | 700                | 22.4              | ? / 4 (No AA)                     | 48 Shader
PS3             | 1+7 x 3200 / 500      | 256 / 256             | 650                | 20.8 / 22.4       | ? / 4 (No AA)                     | 24 / 8 Shader
Xbox One        | 8 x 1750 / 853        | 8000 (Shared)         | 2133 (GDDR3)       | 68 / 102 x 2      | 12.8 / 40.9                       | 768 Cores
PS4             | 8 x 1600 / 800        | 8000 (Shared)         | 5500 (GDDR5)       | 176               | 25.6 / 57.6                       | 1152 Cores
GeForce GTX 680 | - / 1058              | - / 2048              | 6000               | 192               | 128.8                             | 1536 Cores
GeForce GTX 980 | - / 1216              | - / 4096              | 7000               | 224               | 144                               | 2048 Cores
GPU / System Interface
• It is faster to keep graphics data local to the GPU
• But the CPU and system memory still need to interface with the graphics unit:
  – To get data into the graphics memory
  – To issue instructions to the GPU (DrawPrimitive)
  – For dynamic geometry / textures
• So the interface between the graphics unit and the rest of the system is critical
• Games consoles have the graphics unit built into the motherboard and so are closely coupled
  – E.g. Xbox 360/One, & PS3/4 have fairly symmetrical performance between GPU, CPU & memory
PC Graphics Interfaces
• Interface between system and GPU on PC:
  – PCI Express
    • Uses serial ‘lanes’ to transfer streams of data in parallel
    • Version 3 supports roughly 1 GB/s per lane in each direction (theoretical)
    • Version 4 specs are being finalised
  – Has superseded the earlier interfaces AGP & PCI
  – But can still be a bottleneck
    • (Console memory bus much faster)
• Slow compared to local GPU memory
  – PCs must rely on more graphics memory
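To see why the PC interface can be a bottleneck, the theoretical PCI Express rate can be compared with a local graphics memory figure. This is a rough sketch: the function name is hypothetical, the PCIe 3.0 numbers (8 GT/s per lane, 128b/130b encoding) come from the published standard rather than these slides, and a 16-lane slot and a 224 GB/s local memory bus are assumed for the comparison.

```python
def pcie_bandwidth_gb_per_s(transfer_rate_gt_s, encoding_efficiency, lanes):
    """Theoretical one-direction bandwidth of a PCI Express link."""
    # Each transfer carries one bit per lane; encoding overhead reduces the payload.
    payload_bits_per_second = transfer_rate_gt_s * 1e9 * encoding_efficiency
    return payload_bits_per_second / 8 / 1e9 * lanes


PCIE3_RATE_GT_S = 8.0       # per lane, PCIe version 3
PCIE3_ENCODING = 128 / 130  # 128b/130b line encoding

per_lane = pcie_bandwidth_gb_per_s(PCIE3_RATE_GT_S, PCIE3_ENCODING, lanes=1)
x16_link = pcie_bandwidth_gb_per_s(PCIE3_RATE_GT_S, PCIE3_ENCODING, lanes=16)
local_memory = 224.0        # GB/s, assumed local graphics memory bandwidth

print(f"per lane: {per_lane:.3f} GB/s, x16: {x16_link:.2f} GB/s")
print(f"local memory is about {local_memory / x16_link:.0f}x faster than the x16 link")
```

Even a full 16-lane link is more than ten times slower than the assumed local memory bus, which is why graphics data is best kept in graphics memory.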
Parallelism / Concurrency
• GPUs are parallel architectures
  – Processing many pixels / vertices at the same time
• A graphics application is typically a concurrent system
  – Graphics rendered while main application does other processing
• Concurrent = different tasks performed simultaneously
• Parallel = the same task split up and performed simultaneously
• Need to program carefully to get best performance:
  – Ensure both CPU and GPU are working all the time
  – Neither should be waiting for the other to complete its current task
  – But watch out for problems with shared data
• This has implications for programming graphics applications
  – Games students will see this in the 3rd year
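The CPU/GPU concurrency described above can be sketched with two threads and a bounded queue. This is only an analogy in plain Python: the worker thread stands in for the GPU, the queue stands in for the command buffer, and `run_frames` and `max_queued` are hypothetical names chosen for illustration.

```python
import queue
import threading


def run_frames(frame_count, max_queued=2):
    """Main thread (the 'CPU') submits frames while a worker (the 'GPU') renders them."""
    commands = queue.Queue(maxsize=max_queued)  # bounded, so the CPU cannot run far ahead
    rendered = []  # written only by the worker, read only after join()

    def gpu_worker():
        while True:
            frame = commands.get()
            if frame is None:  # sentinel: no more frames
                break
            rendered.append(f"rendered frame {frame}")

    gpu = threading.Thread(target=gpu_worker)
    gpu.start()
    for frame in range(frame_count):
        # Application work for this frame (input, physics, AI) would happen here.
        commands.put(frame)  # blocks only when the worker is max_queued frames behind
    commands.put(None)
    gpu.join()
    return rendered


print(run_frames(3))
```

The shared `rendered` list is safe here only because one thread writes it and the other reads it after `join()`; data touched by both sides at the same time needs explicit synchronisation.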
General Purpose GPU
• A GPU is a massively parallel processor
  – Especially for vector maths operations
• So it can be used for certain non-graphics tasks
  – Physics simulation, video processing, weather forecasting, etc.
  – Anything with massive amounts of mathematical calculation
  – Called General Purpose GPU processing (GPGPU)
• Several APIs for this:
  – CUDA (from NVidia)
    • Extension to C language
    • Up to 40 times faster than same code on CPU
  – OpenCL for ATI and NVidia platforms
  – Compute Shaders as part of DirectX 11
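GPGPU APIs are typically built around a "kernel": a small function that computes one output element and is launched once per element. The sketch below imitates that structure in plain Python. It is not CUDA or OpenCL code, the names `saxpy_kernel` and `launch` are hypothetical, and the loop runs sequentially where a GPU would run many indices at the same time.

```python
def saxpy_kernel(i, a, x, y, out):
    """Work for one 'thread': element i depends on no other element."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args):
    """Sequential stand-in for a GPU launch over n independent indices."""
    for i in range(n):
        kernel(i, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0, 48.0]
```

Because no element depends on another, the iterations could run in any order or all at once, which is the property that makes such calculations a good fit for a GPU.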