Proposed Work
1
Client-Server Synchronization
Proposed Work
2
Implement Synchronization Framework
• Implement multi-streaming basics
• Separate streams
• Events tagged by virtual clock
• Begin optimization of specific streams
• Partial-knowledge prediction
• Generalize as we go
• Software framework?
3
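The framework bullets above (separate streams, events tagged by a virtual clock) can be sketched roughly as follows. This is a minimal illustration, not the project's framework: `Event`, `StreamMux`, the stream names, and the string payload are all hypothetical, and the sketch assumes the virtual clock is a simple integer tick.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event record: each event carries the virtual-clock tick at
// which it should take effect, independent of when it arrives.
struct Event {
    std::uint64_t tick;     // virtual clock tag
    std::string   payload;  // opaque, stream-specific data
};

// Comparator that keeps the event with the smallest tick on top of the queue.
struct LaterTick {
    bool operator()(const Event& a, const Event& b) const { return a.tick > b.tick; }
};

// One queue per named stream, so each stream can be handled (and later
// optimized) separately from the others.
class StreamMux {
public:
    void push(const std::string& stream, Event e) {
        queues_[stream].push(std::move(e));
    }

    // Remove and return every event on `stream` tagged at or before `now`,
    // in increasing tick order. Unknown streams yield an empty result.
    std::vector<Event> drain(const std::string& stream, std::uint64_t now) {
        std::vector<Event> due;
        auto it = queues_.find(stream);
        if (it == queues_.end()) return due;
        auto& q = it->second;
        while (!q.empty() && q.top().tick <= now) {
            due.push_back(q.top());
            q.pop();
        }
        return due;
    }

private:
    std::map<std::string,
             std::priority_queue<Event, std::vector<Event>, LaterTick>> queues_;
};
```

Keeping one queue per stream means an event that arrives early simply waits until the local virtual clock reaches its tag, which is the property a pre-transmission scheme would rely on.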
Implement Building Selection Synchronization
• First target: decouple from dependencies
• Most important: pre-compute far ahead
• All clients informed of future selections
• Selection may depend upon past object position
• Secondary: adjust selection criteria
• More deterministic: compensation less likely
• Proximity a factor, but eliminate hard-edge cases
4
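One way to read "pre-compute far ahead" is that server and clients derive the same future selection schedule from shared state. The sketch below assumes, purely for illustration, that selection is a pseudo-random choice driven by a shared seed; the slides do not say this, and the function name and parameters are hypothetical. It ignores the case where selection depends on past object position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical: compute which building index is selected at each of the next
// `steps` selection points, from a seed shared by the server and all clients.
std::vector<std::size_t> precompute_selections(std::uint32_t shared_seed,
                                               std::size_t building_count,
                                               std::size_t steps) {
    assert(building_count > 0);
    std::mt19937 rng(shared_seed);  // engine output is fixed by the C++ standard
    std::vector<std::size_t> schedule;
    schedule.reserve(steps);
    for (std::size_t i = 0; i < steps; ++i) {
        // Use the raw engine output: std::uniform_int_distribution is
        // implementation-defined and may differ between platforms.
        schedule.push_back(rng() % building_count);
    }
    return schedule;
}
```

Because every participant computes the same schedule, no selection message needs to arrive in time, which is the sense in which a more deterministic selection makes compensation less likely.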
Object Transform Stream
• Common Verlet integrator functor tested
• Dynamically test floating-point drift
• Compensation could ramp up for some systems
• Test most efficient way to deal with collisions
• Transmit collision info (including narrow misses)
• Pre-transmit transform updates at critical points
5
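A common integrator functor shared by client and server might look like the sketch below. It uses the standard position Verlet update, x_next = 2x - x_prev + a*dt^2; the `Vec3` and `VerletStep` names are hypothetical, and the actual functor in the project may differ.

```cpp
#include <cassert>

struct Vec3 { double x, y, z; };

// Position Verlet step: x_next = 2*x - x_prev + a*dt^2.
// Running the same functor on client and server is what lets both sides
// advance an object without exchanging every transform.
struct VerletStep {
    double dt;  // fixed time step, assumed identical on every machine

    Vec3 operator()(const Vec3& pos, const Vec3& prev, const Vec3& acc) const {
        assert(dt > 0.0);
        const double dt2 = dt * dt;
        return Vec3{2.0 * pos.x - prev.x + acc.x * dt2,
                    2.0 * pos.y - prev.y + acc.y * dt2,
                    2.0 * pos.z - prev.z + acc.z * dt2};
    }
};
```

Even with identical source, results can diverge between machines (for example when a compiler fuses multiply-add operations), which is why drift has to be tested dynamically rather than assumed away.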
Significance
• Bandwidth is a scalability issue
• Scalable City’s complex environment
• Bandwidth is a cost issue
• Mobile systems
• Latency tolerance is a cost issue
• Normally addressed with more local servers
• Assumes dividing up customers is an option
6
Physics System
Proposed Work
7
Host Broad Phase Path
• Broad phase and CLEngine are ready!
• Implement active-subset maintenance code
• Leverages existing subsystems for performance
• Analyze performance characteristics
• How many players before bp becomes liability?
• Expect very good results, but results must be evaluated holistically with communication in the picture
8
OpenCL Broad Phase Path
• CL Hash Grid with additional query stage
• Traditional “gather results” challenge now applies
• Comm. plumbing for state maintenance
• Host code to manage very large objects
• Optional: Evaluate a state-aware CLEngine
• Adds a level of indirection, but reduces work
9
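The hash grid with a separate query stage can be illustrated on the host in plain C++. This is a 2D sketch of the general technique only, not the OpenCL implementation: `HashGrid`, `Point`, and the 3x3 neighbourhood are assumptions made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Point { float x, y; };

class HashGrid {
public:
    explicit HashGrid(float cell_size) : cell_(cell_size) {}

    // Build stage: file each object id under the cell containing its position.
    void insert(int id, Point p) {
        cells_[key(coord(p.x), coord(p.y))].push_back(id);
    }

    // Query stage: gather ids from the 3x3 block of cells around p. The result
    // is a sorted candidate list; a narrow phase would still filter it.
    std::vector<int> query(Point p) const {
        std::vector<int> out;
        const std::int32_t cx = coord(p.x), cy = coord(p.y);
        for (std::int32_t dx = -1; dx <= 1; ++dx) {
            for (std::int32_t dy = -1; dy <= 1; ++dy) {
                auto it = cells_.find(key(cx + dx, cy + dy));
                if (it != cells_.end())
                    out.insert(out.end(), it->second.begin(), it->second.end());
            }
        }
        std::sort(out.begin(), out.end());
        return out;
    }

private:
    std::int32_t coord(float v) const {
        return static_cast<std::int32_t>(std::floor(v / cell_));
    }
    // Pack both cell coordinates into one 64-bit key (no collisions).
    static std::uint64_t key(std::int32_t cx, std::int32_t cy) {
        return (static_cast<std::uint64_t>(static_cast<std::uint32_t>(cx)) << 32) |
               static_cast<std::uint32_t>(cy);
    }

    float cell_;
    std::unordered_map<std::uint64_t, std::vector<int>> cells_;
};
```

On the host, `query` can grow its output vector freely. On a device the result length is not known in advance, which is one form of the "gather results" challenge the slide mentions.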
Hybrid Broad Phase
• Each system may exhibit valuable trade-offs
• Host BP: Lower power most of the time?
• CL BP: Higher maximum performance?
• Dynamically switching systems possible
• Best of all worlds: based upon cross-over point
• Only if performance data indicates benefit
10
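Dynamic switching around a cross-over point could be sketched as below. The sketch assumes the cross-over is expressed as an active-object count and adds a hysteresis margin so the system does not flip back and forth near the threshold; both choices, and all names, are hypothetical rather than taken from the slides.

```cpp
#include <cstddef>

enum class BroadPhase { Host, OpenCL };

// Chooses a broad-phase path from the current load, with hysteresis.
class BroadPhaseSelector {
public:
    // `crossover`: load at which the OpenCL path is assumed to overtake the
    // host path (would come from measured performance data).
    // `margin`: dead band around the cross-over to avoid rapid toggling.
    BroadPhaseSelector(std::size_t crossover, std::size_t margin)
        : crossover_(crossover), margin_(margin) {}

    BroadPhase update(std::size_t active_objects) {
        if (current_ == BroadPhase::Host &&
            active_objects > crossover_ + margin_) {
            current_ = BroadPhase::OpenCL;
        } else if (current_ == BroadPhase::OpenCL &&
                   active_objects + margin_ < crossover_) {
            current_ = BroadPhase::Host;
        }
        return current_;
    }

private:
    std::size_t crossover_;
    std::size_t margin_;
    BroadPhase current_ = BroadPhase::Host;  // start on the lower-power path
};
```

Whether such a selector is worth having depends on the measured data, as the last bullet of the slide notes.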
Optimizations
• Many optimizations not yet done
• Communication (Transport Kernel, more)
• Computation (rewrite stick constraint solver, etc.)
• OpenCL (local memory, vector types)
• Expect much more speed-up
• Will address once complete system is up
• Attack highest overheads first
11
Distributed Layer
• Perform object migration, ghosting via MPI
• Or extend our custom boost::asio/ser. system
• Allows scaling beyond a single memory space
• Apply to clusters, server farms
• No longer limited to bandwidth of a few channels
• Test very large scale systems
12
Significance
• Highest performance physics system
• Most power-efficient system
• Usable for large VR environments, games
• Low-power mobile device support w/ OpenCL
• Likely to be available soon
13
Tracking Heap Growth
Proposed Work
14
Diagnosing Unbounded Heap Growth in C++
Proposed Work
• Implement multi-threaded sampling
• Stack tracing for insertion operations
– Identify points of growth for aggregates
• Reduce overhead of temp aggregates
– Detect stack-based aggregates
– Minimum lifetime before tracking begins
– Allow stack-tracing with low performance impact
15
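The "minimum lifetime before tracking begins" idea can be sketched as a tracker that ignores an aggregate until it has survived a threshold, so short-lived temporaries never incur sampling cost. This is an illustrative sketch only: `AggregateTracker`, the integer ids, and the abstract time units are assumptions, not the tool's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

class AggregateTracker {
public:
    explicit AggregateTracker(std::uint64_t min_lifetime)
        : min_lifetime_(min_lifetime) {}

    // Record when an aggregate (container) comes into existence.
    void on_create(int id, std::uint64_t now) { born_[id] = now; }

    // Forget an aggregate and any samples taken for it.
    void on_destroy(int id) {
        born_.erase(id);
        samples_.erase(id);
    }

    // Called by the sampler. Records the size only if the aggregate is known
    // and has lived at least `min_lifetime`; returns whether it was recorded.
    bool sample(int id, std::uint64_t now, std::size_t size) {
        auto it = born_.find(id);
        if (it == born_.end()) return false;
        assert(now >= it->second);  // time is assumed monotonic
        if (now - it->second < min_lifetime_) return false;
        samples_[id].push_back(size);
        return true;
    }

    std::size_t sample_count(int id) const {
        auto it = samples_.find(id);
        return it == samples_.end() ? 0 : it->second.size();
    }

private:
    std::uint64_t min_lifetime_;
    std::unordered_map<int, std::uint64_t> born_;
    std::unordered_map<int, std::vector<std::size_t>> samples_;
};
```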
Diagnosing Unbounded Heap Growth in C++
Proposed Work
• Child aggregates problem (e.g., Linear Hashing buckets)
– Children arrive, grow, and are then replaced
• Google Chrome / Chromium results
– Continuously-growing false positives
– Parent is tumor, children being detected
16
[Chart: Tumors; x-axis ticks 2 to 7, y-axis 0 to 700]
Diagnosing Unbounded Heap Growth in C++
Proposed Work
• All children have precisely the same type tag
– Can use a “unique” filter post-processing
– Not ideal: can cause you to miss tumors!
• Problem: no concept of aggregate relationships
17
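The "unique" post-processing filter, and why it can hide real tumors, can be shown with a small sketch. The `Report` record and its fields are hypothetical; the only assumption taken from the slide is that reports carry a type tag and that child aggregates share the same one.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical growth report for one aggregate.
struct Report {
    std::string type_tag;      // identical for all children of one parent
    int         aggregate_id;  // which instance was flagged
    std::size_t growth;        // observed growth, in elements
};

// Keep only the first report seen for each type tag, preserving input order.
std::vector<Report> unique_by_type(const std::vector<Report>& reports) {
    std::unordered_set<std::string> seen;
    std::vector<Report> out;
    for (const Report& r : reports) {
        if (seen.insert(r.type_tag).second) out.push_back(r);
    }
    return out;
}
```

The filter collapses a flood of child reports into one line, but it also discards any genuinely distinct tumor that happens to share a type tag with an earlier report, which is the drawback the slide points out.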
[Chart: Tumors, False Positives, False Negatives; x-axis ticks 2 to 7, y-axis 0 to 35]
Diagnosing Unbounded Heap Growth in C++
Proposed Work
• Identifying aggregate relationships
– Detect size holistically
– Performance challenges
• Increased automation
– Detect custom data structures
   • Automatically create wrappers for them
   • Presently user must identify & create wrapper
– Perform code transformations automatically
   • Solutions not completely worked out
– Both may require parser (e.g. clang)
18
Significance
• Long-running applications need dynamic analysis
• Our experience indicates that most software suffers from these problems
• Allowed Scalable City to run for months
• Mobile applications: less memory, no VM
• Lowers cost of bug discovery significantly
• Stable software is good for everyone!
19