Transcendently Excellent C++ for Games
Bruce Dawson, CPU guy (channeling Pete Isensee, C++ Guy)
XNA Developer Connection
Themes
  New Visual C++ features
  ISO C++ compliance
  Exception handling
  64-bit programming
  Game optimization cornucopia
    STL and TR1 tips
    Fast vector math
    Pointer aliasing
State of C++ at Microsoft
  C++ is as important as ever to Microsoft
  Today: Visual Studio 2005
    Multiple versions: Express to Team Edition
    Supports a huge list of platforms: Windows to Xbox 360
  The future
    Next version of Visual Studio: “Orcas”
    Future version of compiler: “Phoenix”
VC++ 2005 New Features
  Variable number of macro arguments
  Array element count macro
  Static assertions
  Deprecating old functions
  OpenMP multiprocessor support
  Static code analysis option
  Safe CRT
  Checked iterators
  Throwing new
Variadic Macros
  From the C99 ISO Standard
  Pass a variable number of macro params
    #ifdef _DEBUG
    #define LOG( s, ... ) \
        fprintf( LogFile, s, __VA_ARGS__ )
    #else
    #define LOG( s, ... )
    #endif
    LOG( "Battlestar = %d (name = %s)\n", i,
         GetBattlestarName(i) );
_countof Macro
  Returns number of elements in an array
  Similar to (sizeof(a)/sizeof(a[0])), only safer
    Compile error for pointer arguments
    Uses cool template mojo
    #include <cstdlib>
    static int Cylons[] = { DORAL, SIX, CONOY };
    const int numCylons = _countof(Cylons);
    char* ptr; // Next line won't compile
    const int ptrCount = _countof(ptr);
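The "template mojo" can be approximated in standard C++. The sketch below is not the CRT's implementation: CountOf and kCylons are hypothetical names, and a C++11 compiler is assumed. A function template whose parameter is a reference to an array only matches real arrays, so passing a pointer fails to compile.

```cpp
#include <cstddef>

// Hypothetical portable stand-in for _countof (not the CRT's implementation).
// The parameter is a reference to an array of N elements, so N is deduced
// from the argument's type and pointer arguments do not match at all.
template <typename T, std::size_t N>
constexpr std::size_t CountOf( const T (&)[N] ) noexcept
{
    return N;
}

static const int kCylons[] = { 1, 2, 3 };   // placeholder values

// Evaluated entirely at compile time.
static_assert( CountOf( kCylons ) == 3, "element count is deduced from the array type" );

// const char* ptr = nullptr;
// CountOf( ptr );   // error: no matching function, ptr is not an array
```

The plain sizeof(a)/sizeof(a[0]) idiom accepts a pointer silently and yields a wrong count, which is the hazard the array-reference parameter removes.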
STATIC_ASSERT Macro
  Compile-time assertion mechanism
  Very useful with template programming
    #include "CrtDbg.h"
    template <class To, class From>
    To safe_reinterpret_cast( From from )
    {
        _STATIC_ASSERT( sizeof(From) <= sizeof(To) );
        return reinterpret_cast<To>( from );
    }
Deprecating Functions
  Inform your team that something is going away without breaking the build
    #pragma deprecated( ViperMarkVII )          // deprecate an identifier
    #pragma deprecated( "BOOMER" )              // deprecate a macro (quoted so it is not expanded)
    __declspec(deprecated) void Boxey(){}       // deprecates only this overload
    void Boxey(int) {}
OpenMP (/openmp)
  Industry-standard extension to C++
  A quick and simple way of making portions of your game multithreaded
  Uses: particle systems, collision, sorting
    #pragma omp parallel for
    for( int i = 0; i < numParticles; ++i )
        UpdateTyliumParticles( particle[i] );
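A self-contained variant of the loop above, as a sketch. Particle and UpdateParticles are hypothetical names, and the update rule is invented for illustration. The point is that each iteration touches only its own element, which is what makes the loop safe to split across threads. Without OpenMP enabled the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical particle type; real particle data will differ.
struct Particle { float position; float velocity; };

// Advances every particle by dt. Each iteration reads and writes only
// particles[i], so iterations are independent of one another.
void UpdateParticles( std::vector<Particle>& particles, float dt )
{
    // OpenMP 2.0 (the version in VS 2005) requires a signed integer loop index.
    assert( particles.size() <= static_cast<std::size_t>( INT_MAX ) );
    const int count = static_cast<int>( particles.size() );

    #pragma omp parallel for
    for( int i = 0; i < count; ++i )
        particles[i].position += particles[i].velocity * dt;
}
```

Loops that accumulate into a shared variable or append to a shared container do not qualify without extra synchronization.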
Static Code Analysis (/analyze)
  Provides lint-like code checking
    Common buffer overruns
    Dereferencing NULL
    Memory leaks
    Use of uninitialized memory
  Supports custom function annotation
    Specify callers must check return values, NULL pointers allowed, params are read or write or both, where the buffer size is found, etc.
    void Fill([Pre(WritableElements="count")] int* p, int count);
Safe CRT
  400 new safe CRT functions!
  Detect buffer overruns, accessing NULL ptrs, string formatting issues, invalid flags and much, much more
    char s[6];
    // Unsafe
    strcpy(s, "Hello, Caprica");
    // Still unsafe!!! (no terminator is written when the source fills the buffer)
    strncpy(s, "Hello, Caprica", _countof(s));
    // Safe, but tedious
    strncpy(s, "Hello, Caprica", _countof(s));
    s[_countof(s)-1] = 0;
    // Three safe options:
    strcpy_s(s, _countof(s), "Hello, Caprica");
    strncpy_s(s, _countof(s), "Hello, Caprica", _TRUNCATE);
    // Recommended (simplest): buffer size is deduced from the array
    strcpy_s(s, "Multiple DRADIS contacts!");
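The array-only form of strcpy_s works because a template can deduce the buffer size from the array type. A rough portable sketch of that idea follows. CopyTruncated is a hypothetical helper, not a CRT function, and unlike strcpy_s it truncates and reports that through its return value rather than invoking the invalid-parameter handler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: copies src into dst, always terminating the result.
// Returns true if the whole string fit, false if it was truncated.
template <std::size_t N>
bool CopyTruncated( char (&dst)[N], const char* src )
{
    assert( src != nullptr );
    // snprintf writes at most N-1 characters plus a terminator and returns
    // the length the full string would have needed.
    const int needed = std::snprintf( dst, N, "%s", src );
    return needed >= 0 && static_cast<std::size_t>( needed ) < N;
}
```

Because the size comes from the type, the caller cannot pass a stale or mismatched count, which is the same property that makes the shortest strcpy_s form the recommended one.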
Safe CRT Recommendations
  Replace insecure functions. Really.
  Consider using std::string and friends.
  Insecure functions are automatically deprecated on Windows.
  Define _XBOX_CRT_DEPRECATE_INSECURE to deprecate insecure funcs on Xbox 360.
  Define _CRT_SECURE_NO_DEPRECATE to temporarily disable deprecation.
Checked Iterators
  Many debugging and security enhancements; no code changes required
  Detect invalid iterator ranges, iterators from different containers, walking off container end
  _HAS_ITERATOR_DEBUGGING: enabled in debug builds; disabled in release builds
  _SECURE_SCL: enabled in debug and release
    vector<int> adar(12);
    vector<int> roslin(11);
    copy( adar.begin(), adar.end(),
          roslin.begin() ); // detected: writes past the end of roslin
Throwing new
  Prior to VS 2005, CRT new function returned NULL on failure
  Now new throws std::bad_alloc on failure
    // new behavior: throws bad_alloc
    Baltar* pBaltar = new Baltar;
    // old behavior: returns NULL
    Kobol* pKobol = new(std::nothrow) Kobol;
    // Don't check for NULL before deleting
    // if pRaptor!=NULL not required (never was)
    delete pRaptor;
Buffer Security Check
  /GS compiler option provides run-time buffer overrun detection
  Overhead is low on Windows and Xbox, but significant on Xbox 360
  On Windows and Xbox, enable /GS for both debug and release builds
  On Xbox 360, disable /GS completely
    Xbox 360 stack is non-executable, so /GS has less value
ISO C++ Compliance in VS2005
  Over 98% compliant on the Plum-Hall suite
  Scopes for-loop variables properly
  Compiles Boost, Loki, and other modern template libraries
  Non-compliant areas
    No two-phase lookup
    No export keyword
    No exception specifications
Exception Specifications
  Examples
    void Starbuck();             // throws anything
    void Apollo() throw(...);    // throws anything
    void Adama() throw();        // throws nothing
    void Helo() throw(X,Y);      // throws only X or Y
  Issues
    Not compile-time guarantees
    Ignored by the VS compiler (mostly)
    VS-specific: throw() = __declspec(nothrow)
Excep Spec Recommendations
  Exception specifications suck. Avoid them.
  Use __declspec(nothrow) on functions that can never throw
    Dtors
    Trivial accessors
    Swap routines
  Never, ever throw from a nothrow or throw() function
C++ Exception Handling
  Use RAII (whether you use C++ EH or not)
  Use C++ EH for exceptional conditions only
  Never throw from dtors, swap(), dealloc()
  Overall perf penalty ranges from 1–5%
    Higher on Xbox/x86 than Xbox 360/x64
  Consider disabling C++ EH, esp. on consoles
    Off by default on Xbox 360 (and not robust)
    Consider using C++ EH in debug mode only
    Wrap throwing libraries
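RAII pays off with or without exceptions because cleanup runs on every exit path. A minimal sketch, in which AcquireHandle, ReleaseHandle, ScopedHandle, and LoadLevel are all hypothetical names standing in for a real resource such as a file, lock, or device handle:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource API: a counter stands in for a real handle table.
static int g_openHandles = 0;
inline int  AcquireHandle()     { return ++g_openHandles; }
inline void ReleaseHandle( int ) { assert( g_openHandles > 0 ); --g_openHandles; }

// Owns one handle for exactly as long as the object is in scope.
class ScopedHandle
{
public:
    ScopedHandle() : handle_( AcquireHandle() ) {}
    ~ScopedHandle() { ReleaseHandle( handle_ ); }   // must never throw
    ScopedHandle( const ScopedHandle& ) = delete;
    ScopedHandle& operator=( const ScopedHandle& ) = delete;
    int Get() const { return handle_; }
private:
    int handle_;
};

// Fails either by return code or by exception; the handle is released
// on the success path, the early return, and the throw alike.
bool LoadLevel( bool simulateFailure, bool useExceptions )
{
    ScopedHandle handle;
    if( simulateFailure )
    {
        if( useExceptions )
            throw std::runtime_error( "load failed" );
        return false;
    }
    return true;
}
```

The same class works unchanged in a build with C++ EH disabled; only the throwing branch goes away.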
The Move to 64-Bit
  64-bit game systems are a reality
  More are coming
    Vista will greatly increase the ramp
    Gaming/entertainment consumers want x64
  Access to much more memory (8 TB)
  Twice as many CPU registers
  Generally not a difficult port
  At least test your 32-bit game on 64-bit
64-Bit Programming Hurdles
  Pointer arithmetic: sizeof(long*) != sizeof(long)
  Bad address calculations and incorrect data alignment or padding assumptions
  Evil code examples
    char* pAck = "By Your Command";
    long CylonPtr = (long)pAck;              // truncates the pointer on 64-bit Windows
    struct Dualla { char c; int n; };
    int n = *(int*)( (BYTE*)&dualla + 4 );   // assumes n sits at byte offset 4
64-Bit Recommendations
  Support both 32-bit and 64-bit games on Windows
  Avoid assumptions about pointer size
    Use INT_PTR and ULONG_PTR and friends
      LONG_PTR CylonPtr = (LONG_PTR)pAck;
    Use size_t and ptrdiff_t
  Avoid assumptions about struct layout or padding
  Compile at the highest warning levels
    Fix “truncation” and “conversion” warnings
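The recommendations above can be sketched in portable C++. This is an illustration under assumptions, not Windows SDK code: std::uintptr_t plays the role of ULONG_PTR, and ToInteger, Distance, and ReadN are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Dualla { char c; int n; };

// A pointer stored as an integer needs a pointer-sized integer type.
inline std::uintptr_t ToInteger( const void* p )
{
    return reinterpret_cast<std::uintptr_t>( p );
}

// Distances between pointers are ptrdiff_t, not int or long.
inline std::ptrdiff_t Distance( const char* first, const char* last )
{
    assert( first <= last );
    return last - first;
}

// Reads Dualla::n through a byte pointer without hard-coding its offset:
// offsetof tracks whatever padding the compiler actually inserted.
inline int ReadN( const Dualla& dualla )
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>( &dualla );
    int n;
    std::memcpy( &n, bytes + offsetof( Dualla, n ), sizeof n );
    return n;
}
```

The same source then builds for 32-bit and 64-bit targets without truncation warnings.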
Data Alignment Issues
  32-bit Windows
    Unaligned access handled automatically; minor (sometimes major) performance impact
  64-bit Windows
    Be wary of extra space/padding in structures because of 64-bit pointers
  Xbox 360
    int: perf penalty for access across 32-byte boundary
    float/double: must be at least 4-byte aligned; perf penalty for access across 32-byte boundary
    VMX128: must use extra instructions for data that is not 16-byte aligned
  Unaligned means reads and writes are not atomic, Interlocked not allowed
Optimization Cornucopia
  Memory
  TR1 (Technical Report 1) features
  STL container tips
  STL iterator tips
  Fast vector math
  Pointer aliasing
  Virtual functions
  Custom allocators
Memory Bandwidth Is Finite
  Proverb: Thou shalt treat memory as if it were thy hard drive
  You will be memory-bound on Xbox 360 and future PCs
  Recommendations
    Be cache-aware
    Use everything you read
      Place uncommonly accessed data elsewhere
      Don't mix hot and cold data
    Avoid multiple passes over large data sets
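One way to act on "don't mix hot and cold data" is to split a record into two parallel arrays. The sketch below is illustrative only: ParticleHot, ParticleCold, ParticleSystem, and Integrate are hypothetical names, and the field choices are assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Touched every frame: keep it small and densely packed.
struct ParticleHot  { float x, y, z; float vx, vy, vz; };

// Touched rarely (tools, debugging): keep it out of the per-frame walk.
struct ParticleCold { std::string debugName; int spawnFrame; };

struct ParticleSystem
{
    std::vector<ParticleHot>  hot;
    std::vector<ParticleCold> cold;   // cold[i] describes hot[i]
};

// A single pass over the hot array; every byte pulled into the cache is used.
void Integrate( ParticleSystem& system, float dt )
{
    assert( system.hot.size() == system.cold.size() );
    for( ParticleHot& p : system.hot )
    {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}
```

With the cold fields embedded in the same struct, each cache line fetched during Integrate would carry data the loop never reads.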
TR1 Recommendations
  TR1 includes some useful new features such as hashed containers and smart ptrs
    Available from boost.org and Dinkumware
  unordered_set/map faster than set/map
    Only slower in worst case (many duplicate keys)
  Consider shared_ptr
    This smart pointer is well tested, industry proven, high performance, and reference counted
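The TR1 hashed containers and shared_ptr were later adopted into C++11, so the sketch below uses the std:: spellings; with a TR1 library the same types live in std::tr1. TextureCache and Texture are hypothetical names used only to show a hashed lookup that hands out reference-counted objects.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

struct Texture { std::string name; };   // hypothetical payload

class TextureCache
{
public:
    // Returns the cached texture for name, creating it on first request.
    std::shared_ptr<Texture> Get( const std::string& name )
    {
        assert( !name.empty() );
        auto found = cache_.find( name );   // hash lookup, no tree walk
        if( found != cache_.end() )
            return found->second;

        auto texture = std::make_shared<Texture>( Texture{ name } );
        cache_.emplace( name, texture );
        return texture;
    }

    std::size_t Size() const { return cache_.size(); }

private:
    std::unordered_map<std::string, std::shared_ptr<Texture>> cache_;
};
```

Every caller and the cache itself share ownership, so a texture is destroyed only after the last holder lets go.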
STL Container Tips
  Prefer contiguous containers
    tr1::array, vector, deque
  Avoid node-based containers
    list, map, set, multimap/multiset
  Prefer tr1::unordered_set/map over set/map
  Consider custom allocators
  Vector + std::sort can be the ideal solution
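A sketch of the "vector + std::sort" approach as a replacement for a map keyed by id: sort once, then look up with a binary search. Entity, LessById, SortById, and FindById are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Entity { int id; float health; };   // hypothetical record

inline bool LessById( const Entity& a, const Entity& b ) { return a.id < b.id; }

// Sort once, typically after loading or after a batch of insertions.
void SortById( std::vector<Entity>& entities )
{
    std::sort( entities.begin(), entities.end(), LessById );
}

// Binary search over contiguous memory. Returns nullptr if id is absent.
const Entity* FindById( const std::vector<Entity>& entities, int id )
{
    assert( std::is_sorted( entities.begin(), entities.end(), LessById ) );
    auto it = std::lower_bound( entities.begin(), entities.end(), id,
        []( const Entity& e, int key ) { return e.id < key; } );
    return ( it != entities.end() && it->id == id ) ? &*it : nullptr;
}
```

This suits data that is built in bulk and then mostly read; frequent single insertions into the middle of a large vector favor a different container.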
STL Iterator Tips
  Pass iterators by value
  Write functions to accept iterators, not containers
  Prefer iterators to indices
    With some exceptions on Xbox 360 prior to April XDK
  Use pre-increment unless you need the result (++i good, i++ bad)
  With reverse iterators, understand the difference between ri and ri.base()
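The ri versus ri.base() difference matters as soon as a reverse iterator is handed to a member function that wants a forward iterator, such as erase(). A small sketch, with EraseLast as a hypothetical helper name:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Erases the last element equal to value. Returns true if one was removed.
bool EraseLast( std::vector<int>& values, int value )
{
    std::vector<int>::reverse_iterator ri =
        std::find( values.rbegin(), values.rend(), value );
    if( ri == values.rend() )
        return false;

    // ri designates the match, but ri.base() points one element past it.
    assert( &*ri == &*( ri.base() - 1 ) );

    // erase() needs a forward iterator to the match itself.
    values.erase( std::next( ri ).base() );
    return true;
}
```

Passing ri.base() directly would erase the element after the match, or invoke undefined behavior when the match is the last element.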
Mathematical Vectors
  struct vec { float w,x,y,z; } is evil
  vector<vector<float> > is Baltar-evil
  Use native vector types for best perf
  Xbox 360
    Use __vector4 or XMVECTOR
    Use Xboxmath.h routines
  Windows
    Prefer __m128
    Avoid D3DVECTOR for core game code
Cross-Platform Vectors
    #if( _XBOX_VER == 200 ) // Xbox 360
    #include "XboxMath.h"
    typedef __vector4 XVec;
    typedef XVec XVecParam;
    #elif defined( _XBOX ) // original Xbox
    #include "XgMath.h"
    typedef XGVECTOR4 XVec;
    typedef const XVec& XVecParam;
    #elif defined( _WIN32 ) // Windows
    #include "XmmIntrin.h"
    typedef __m128 XVec;
    typedef XVec XVecParam;
    #endif
Example XVec Function
    inline XVec XVecAdd( XVecParam vA, XVecParam vB )
    {
    #if( _XBOX_VER == 200 ) // Xbox 360
        return __vaddfp( vA, vB );
    #elif defined( _XBOX ) // original Xbox
        XVec result;
        XGVec4Add( &result, &vA, &vB );
        return result;
    #elif defined( _WIN32 ) // Windows
        return _mm_add_ps( vA, vB );
    #endif
    }
Vectors in structs/classes
  Given struct BaseStar { __vector4 v; };, compiler won’t normally pass BaseStar in a register on Xbox 360
  Available now in the Xbox 360 XDK:
    __declspec(passinreg)
    struct BaseStar { __vector4 v; };
  Allows you to create efficient cross-platform vector classes
Pointer Aliasing
  Pointers are “aliased” if they could ever point to the same chunk of memory
  Aliasing prevents compiler optimizations
    void Add( float* pGaeta, const float* p ){
        for( int i = 0; i < n; ++i )
            pGaeta[i] = *p + 1.0f;
    }
  Compiler must assume p could point into pGaeta
  Must reload *p at every iteration. Ouch!
Aliasing Recommendations
  Use __restrict to tell the compiler that there will be no aliasing
    Pointer parameters of functions
    Local pointers
    Pointers in structs/classes
  Limitations
    Doesn’t work on references
    void Add( float* __restrict pGaeta,
              const float* __restrict p )
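__restrict is a compiler extension. Where it is unavailable, the reload can often be avoided in standard C++ by reading the value into a local before the loop. In the sketch below, AddScalar is a hypothetical name and the element count n is passed explicitly as an assumption. The rewrite is equivalent to the original loop only if p does not point into dst.

```cpp
#include <cassert>

// Hoists the load of *p out of the loop by hand. Valid only when the
// caller guarantees that p does not alias any element of dst.
void AddScalar( float* dst, const float* p, int n )
{
    assert( dst != nullptr && p != nullptr && n >= 0 );
    const float value = *p + 1.0f;   // *p is read exactly once
    for( int i = 0; i < n; ++i )
        dst[i] = value;
}
```

A local copy cannot be changed by the stores to dst, so the compiler is free to keep it in a register for the whole loop.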
Beware Virtual Functions
  Virtual functions are useful
  They come with a cost
    This is particularly true on Xbox 360
  Recommendations
    Use vfuncs with care
    Avoid vfuncs in hot code
    Avoid vfuncs to distinguish between platforms
    Use profilers to examine calls. If all targets are identical, virtual is probably unnecessary
Beware Constructors
  Ctors often dominate execution time
  Ctors are called everywhere
    Local objects and arrays
    Overloaded operators
    Adding objects to containers
    Nameless temporaries
  Streamline ctor code
    Example: don’t clear needlessly
  Consider inlining small ctors
Choose the Right Data Types
  Prefer 32- or 64-bit ints over 8- and 16-bit for locals and parameters
  Prefer doubles at run time, floats for storage
    Note on Windows that D3D device init automatically disables double precision
    Note that x86 sqrt CRT function runs slower when FPU set to float precision
    x86 fsqrt instruction runs faster when FPU set to float precision
  Consider bitfields, FLOAT16, and other smaller formats
Memory Allocation
  Memory alloc/free routines are some of the most costly functions you can call
    Often imply synchronization
  Rule of thumb: No allocs in game loop
  Recommendations
    Understand what functions allocate
    Hook XMemAlloc on Xbox and Xbox 360
    Consider custom allocators: fixed-size allocators, per-thread allocators, etc.
Custom Allocators
  Common allocator types
    Fixed-sized allocators to reduce overhead
    Pool allocators to avoid fragmentation
    Per-container allocators for cache coherency
    Per-thread allocators for cache coherency
  Writing STL allocators is non-trivial
    Use the default allocator in <memory> as your starting point
  Useful info at the memory management talk from Gamefest
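A minimal sketch of a fixed-size pool allocator, to make the first two allocator types concrete. FixedPool is a hypothetical class, not an STL-conforming allocator. It assumes single-threaded use, that T's constructor does not throw, and that T needs no more alignment than the default heap provides.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

template <typename T>
class FixedPool
{
public:
    // One heap allocation up front; none afterwards.
    explicit FixedPool( std::size_t capacity )
        : slots_( capacity ), freeList_( nullptr )
    {
        // Thread every slot onto the free list, lowest address first.
        for( std::size_t i = capacity; i > 0; --i )
        {
            slots_[i - 1].next = freeList_;
            freeList_ = &slots_[i - 1];
        }
    }

    FixedPool( const FixedPool& ) = delete;
    FixedPool& operator=( const FixedPool& ) = delete;

    // Constructs a T in a free slot. Returns nullptr when the pool is full.
    template <typename... Args>
    T* Create( Args&&... args )
    {
        if( freeList_ == nullptr )
            return nullptr;
        Slot* slot = freeList_;
        freeList_ = slot->next;
        return ::new( static_cast<void*>( slot->bytes ) ) T( std::forward<Args>( args )... );
    }

    // Destroys the object and returns its slot to the free list.
    void Destroy( T* object )
    {
        if( object == nullptr )
            return;
        Slot* slot = reinterpret_cast<Slot*>( object );
        assert( slot >= slots_.data() && slot < slots_.data() + slots_.size() );
        object->~T();
        slot->next = freeList_;
        freeList_ = slot;
    }

private:
    union Slot
    {
        Slot* next;                                     // used while the slot is free
        alignas( T ) unsigned char bytes[sizeof( T )];  // used while it holds a T
    };

    std::vector<Slot> slots_;
    Slot* freeList_;
};
```

Create and Destroy are a few pointer operations each, every block has the same size so the pool cannot fragment, and giving each thread its own pool removes the need for a lock.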
The Road to Sublimity
  Beware                        Embrace
  Exception specifications      Visual Studio 2005
  C++ exception handling        TR1
  Virtual functions             Security features
  Pointer aliasing              Restrict
  vector<vector<float> >        Native vector types
  Node-based containers         64-bit
  Constructors                  Localization
So Say We All
© 2007 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
http://www.xna.com