Modeling Tools for CMP Research

Upload: udit-kumar

Post on 07-Apr-2018


  • 8/6/2019 Modeling Tools for CMP Research

    1/46

    Simics and Friends

    Modeling Tools for CMP Research

    Zvika Guz, Isaskhar (Zigi) Walter

    The Technion, Israel Institute of Technology

    Agenda

    Official Agenda

    Review the most commonly used tools in CMP arch research

    Simulators

    Benchmarks

    Unofficial Agenda

    Convince you to use Simics

    Because more often than not it is the best option

    Because we need more (geographically adjacent) people

    Not on Our Agenda

    Teaching the tools

    Outline

    Choosing a Simulator

    Simics

    And friends

    GEMS, Garnet & Orion, FeS2, SimFlex

    OPNET - modeling CMP interconnect

    Benchmarks

    Summary

    Technion goodies

    Choosing a Simulator

    [Figure: the simulator design space, with Performance and Ease of Use among its labels; the remaining labels are illegible]

    What should it model?

    Processor/Cache/Interconnect/etc.

    What would run on it?

    Benchmark types

    Choosing a Simulator for CMP Research

    What will it model?

    Multiple cores

    Memory hierarchy (caches, coherence)

    Interconnect (NoC)

    What will run on it?

    Multi-threaded benchmarks: need an OS for that

    Commercial workloads: really need an OS for that

    Full-system simulator, capable of booting a (commercial) OS

    Meet the Contenders

    SimpleScalar

    Uniprocessor

    PIN

    Not a simulator

    Several in-house tools

    Not relevant

    M5

    Simics

    ?

    Why Simics? (the short answer)

    Because everyone is using it

    THE most widely used simulator in our field

    1/3 of ISCA07 papers used Simics

    Huge, active community

    Alive and kicking forum

    Because it is free

    For academia

    Up to Simics 4.2

    ..Oh.. and because it is really really good!

    Simics in a Nutshell

    Virtual hardware

    Event driven

    Cycle accurate*

    Complete production software

    The software can't tell the difference

    Runs binaries from real target

    [Figure: target software (user program, middleware, DB, Java VM, operating system, drivers, boot firmware, hardware abstraction layer) sits above the HW/SW interface; below it is the simulated (virtual) hardware: CPUs, RAM, FLASH, ROM, A/D, PCI, I2C, bus, network, disk, disk controller, user interface device]

    http://www.virtutech.com/

    Simics Overview (1/3)

    Simics is a flexible, scalable, and high-performance full-system simulator

    A software, event-driven simulator

    Full-system simulator

    Processor

    Memory hierarchy (DRAM, Disk)

    Network

    Devices (DMA, Interrupt controller, PCI, etc.)

    Runs unmodified binaries

    OS, drivers and applications

    Models the entire machine that the OS sees

    Application cannot tell the difference

    http://www.virtutech.com/

    Simics Overview (2/3)

    Fully supported ISAs:

    SPARC

    x86

    Alpha, Itanium, MIPS, ARM, ..

    Scalable:

    Single processor (uniprocessor/CMP), MPs, racks, clusters

    Distributed systems

    http://www.virtutech.com/

    Simics Overview (3/3)

    Flexible

    Different degrees of simulation (details)

    Functionality only

    Microarchitecture and timing

    Configurable

    Hook/unhook modules

    Control their timing

    Write your own (in C++)

    http://www.virtutech.com/

    Demo

    [Figure: Simics console windows for the simulated targets listed below]

    Solaris/PowerPC

    RedHat 7.2/Itanium

    NT/x86

    RedHat 6.2/x86

    RedHat 7.2/Pentium III

    XP/x86-64

    Solaris 8/UltraSparc II

    http://www.virtutech.com/

    What Have We Seen?

    [Figure: the layered stack, top to bottom]

    User application code

    Middleware and libraries

    Target operating system(s)

    Virtual target hardware

    Simics

    Host operating system

    Host hardware

    http://www.virtutech.com/

    Simics Provides:

    Checkpoints

    Save/restore state

    Breakpoints

    Temporal breakpoints

    Break on memory/register

    Graphics breakpoint

    Magic instructions

    Signal Simics from within your application

    Access host files from the simulated machine

    So much more..

    Simics Timing Models

    Default mode (10X-100X slowdown)

    Every instruction takes exactly 1 clock cycle

    Including access to disk, access to memory, etc.

    In-order mode (1000X-10000X slowdown)

    A timing-model function is called when a memory request occurs

    Function returns the number of cycles to stall

    Out-of-order mode (MAI mode) (10000X-1 million X slowdown)

    Detailed out-of-order arch simulation

    User-defined processor model

    Full control on how instructions advance
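To get a feel for what these slowdown factors mean in practice, here is a small back-of-the-envelope sketch. It assumes the three slowdown ranges on the slide belong to the three modes in the order listed, and that "slowdown" means host wall-clock time divided by native execution time; the function and dictionary names are made up for illustration.

```python
# Slowdown ranges quoted on the slide, relative to native execution.
# Assumption: the ranges map to the modes in the order they are listed.
SLOWDOWN = {
    "default": (10, 100),
    "in-order": (1_000, 10_000),
    "out-of-order": (10_000, 1_000_000),
}


def host_seconds(target_seconds, slowdown):
    """Wall-clock seconds needed to simulate `target_seconds` of target execution."""
    return target_seconds * slowdown


def host_time_range(target_seconds, mode):
    """Best-case and worst-case wall-clock seconds for the given timing mode."""
    low, high = SLOWDOWN[mode]
    return host_seconds(target_seconds, low), host_seconds(target_seconds, high)


if __name__ == "__main__":
    # How long does one second of target execution take on the host?
    for mode in SLOWDOWN:
        low, high = host_time_range(1.0, mode)
        print(f"{mode:>12}: {low / 3600:.3f} to {high / 3600:.3f} host hours")
```

Under these assumptions, one target second in the most detailed mode can cost anywhere from hours to days of host time, which is why the fast mode is typically used to reach the interesting part of the run first.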

    Simics Timing - default

    Emulation mode

    Used for fast-forwarding

    Boot OS

    Build workload

    Fast-forward to relevant execution part

    Basically, used for creating a checkpoint

    Simics Timing - in order

    Timing model is a C program

    You can act on every memory access

    Usually used for modeling:

    Caches (and cache hierarchies)

    Coherency protocols (directory)

    Hardware/Hybrid transactional memory
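The idea of acting on every memory access and answering with a stall time can be sketched as a toy direct-mapped cache. The slides say the real timing model is a C program; Python is used here only to keep the sketch short, and nothing below is the Simics API. The class name, the cache geometry, and the 100-cycle miss penalty are all assumptions chosen for illustration.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that reports a stall time for each access."""

    def __init__(self, num_lines=256, line_size=64, hit_cycles=0, miss_cycles=100):
        self.num_lines = num_lines
        self.line_size = line_size
        self.hit_cycles = hit_cycles
        self.miss_cycles = miss_cycles      # hypothetical miss penalty
        self.tags = [None] * num_lines      # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Called once per memory access; returns the number of cycles to stall."""
        block = address // self.line_size
        index = block % self.num_lines
        tag = block // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return self.hit_cycles
        self.tags[index] = tag              # fill the line, evicting the old tag
        self.misses += 1
        return self.miss_cycles


def run(cache, addresses):
    """Replay a list of addresses and return the total stall cycles."""
    return sum(cache.access(address) for address in addresses)


if __name__ == "__main__":
    cache = DirectMappedCache(num_lines=4, line_size=64)
    stalls = run(cache, [0, 8, 64, 0, 256, 0])
    print(f"hits={cache.hits} misses={cache.misses} stall cycles={stalls}")
```

The hit and miss counters are the kind of statistics one would collect at the end of a run; a cache hierarchy or a coherence protocol would hang off the same per-access hook.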

    Simics Timing - Out Of Order Mode

    Gives full control over timing

    User decides when things happen

    Fetch/decode/execute/commit

    Simics handles how these things happen

    Out-of-order execution, multi-processor, multi-threading, branch prediction, value prediction

    Used for processor arch research

    Models processor internals

    And whenever you need a better notion of time

    Interconnect study

    Simple Example - Adding Cache (1/4)

    Nahalal: a new cache architecture for CMP

    Architectural differentiation of cache lines at runtime

    According to usage - private vs. shared

    [Figure: two 8-core CMP layouts, CPU0 through CPU7]
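The private vs. shared distinction can be illustrated with a few lines that label each cache line by how many cores touched it. This is only a sketch of the classification idea, not the Nahalal placement policy itself; the function name, the trace format of (cpu, address) pairs, and the 64-byte line size are assumptions.

```python
from collections import defaultdict


def classify_lines(accesses, line_size=64):
    """Label each cache line 'private' or 'shared' from (cpu, address) accesses."""
    sharers = defaultdict(set)
    for cpu, address in accesses:
        sharers[address // line_size].add(cpu)
    # A line touched by more than one core counts as shared.
    return {
        line: ("shared" if len(cpus) > 1 else "private")
        for line, cpus in sharers.items()
    }


if __name__ == "__main__":
    trace = [(0, 0x000), (0, 0x010), (1, 0x040), (2, 0x040), (3, 0x080)]
    for line, kind in sorted(classify_lines(trace).items()):
        print(f"line {line}: {kind}")
```

A cache organization can then treat the two classes differently, for example by keeping shared lines where all cores can reach them quickly.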

    Simple Example - Adding Cache (2/4)

    1. Writing a cache timing model

    C program

    Simple Example - Adding Cache (3/4)

    2. Hooking the new cache into Simics

    Python script

    Simple Example - Adding Cache (4/4)

    3. Run Simics and collect statistics

    Simics in Research

    "Virtual Hierarchies", M. R. Marty and M. D. Hill, Micro's Top Picks 2008

    "Improving Multiple-CMP Systems Using Token Coherence", M. R. Marty, J. D. Bingham, M. D. Hill, A. J. Hu, M. K. Martin and D. A. Wood, HPCA 2005

    "Nahalal: Cache Organization for Chip Multiprocessors", Z. Guz, I. Keidar, A. Kolodny, U. C. Weiser, IEEE Computer Architecture Letters, May 2007

    "Memory Mapped ECC: Low-Cost Error Protection for Last Level Caches", D. H.. ,

    "TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory", J. Bobba, N. Goyal, M. D. Hill, M. M. Swift, and D. A. Wood, ISCA 2008

    "Predicting the Performance of Reconfigurable Optical Interconnects in Distributed Shared-Memory Systems", W. Heirman, J. Dambre, I. Artundo, C. Debaes, H. Thienpont, D. Stroobandt, J. Van Campenhout, Photonic Network Communications 08

    "Serializing Instructions in System-Intensive Workloads: Amdahl's Law Strikes Again", P. M. Wells, G. S. Sohi, HPCA 2008

    "Predictor Virtualization", I. Burcea, S. Somogyi, A. Moshovos and B. Falsafi, ASPLOS 2008

    Add-ons for Simics

    Open-source add-ons extend Simics' capabilities

    Some as popular as Simics itself

    Garnet & Orion

    SimFlex

    FeS2

    Multifacet GEMS

    The most mature Simics add-on

    Most of ISCA's Simics papers actually use GEMS

    Alive and active forum

    GEMS is a set of modules for Virtutech Simics that enables detailed simulation of multiprocessor systems, including CMP.

    Two main components

    Ruby: memory system timing simulator

    Opal: timing model for OOO processor

    Flexible

    Can be configured/altered/hacked

    Add your own models

    http://www.cs.wisc.edu/gems/

    GEMS Ruby

    Cache hierarchy

    L1, L2 (private/shared), SNUCA/DNUCA, Simple DRAM

    Different coherence protocols

    Snoop, Directory, Token coherence

    HW transactional memory

    Log-TM

    Sun's Rock

    Interconnect

    Simple

    Garnet - detailed NoC interconnect

    http://www.cs.wisc.edu/gems/

    CMP is More than CPUs and Memory

    We need to model the interconnect too

    Might have a paramount effect on performance and power

    Sometimes, this is all we need!

    [Figure: CMP layout with CPUs, their L1 caches, and banks of L2 cache]

    Simulate the Interconnect? Why Bother?

    Important part of the system!

    Static modeling can account for static attributes

    Topology, routing, link bandwidth, packet size, etc.

    Run-time effects are much harder to (statically) model

    Shared resource arbitration, finite buffer sizes, channel multiplexing, flow control, ...

    Might be dominating factors

    Driving home during rush hours
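The rush-hour point can be made concrete with a tiny simulation of one shared link. A static estimate says every packet takes the link's service time; once packets arrive close together they queue behind each other and the latency grows. This is a minimal sketch with made-up numbers (a 4-cycle service time, four packets), not a model of any particular NoC.

```python
def simulate_link(arrivals, service_cycles):
    """Serve packets FIFO on one shared link; return each packet's latency in cycles."""
    latencies = []
    link_free_at = 0
    for arrival in sorted(arrivals):
        start = max(arrival, link_free_at)   # wait while the link is still busy
        link_free_at = start + service_cycles
        latencies.append(link_free_at - arrival)
    return latencies


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    service = 4                                         # static per-packet estimate
    spread = simulate_link([0, 10, 20, 30], service)    # light traffic
    bursty = simulate_link([0, 1, 2, 3], service)       # same packets at rush hour
    print("static estimate:", service)
    print("light traffic mean latency:", mean(spread))
    print("bursty traffic mean latency:", mean(bursty))
```

With these numbers the lightly loaded case matches the static estimate, while the bursty case averages more than twice that, even though topology, bandwidth, and packet size are identical.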

    Network vs. Full System Simulator

    NoC is a network!

    Use a network oriented tool with built in support for traffic modeling

    Eliminate complex system simulator if not really needed

    Perfect tool for optimizing the interconnect

    Architecture, topology, protocols, parameter tuning, etc.

    Easy programming and debugging

    Fast!

    Fastest discrete event simulation engine among leading industry solutions

    OPNET Modeler Features

    Object-oriented modeling

    Hierarchical modeling environment

    GUI-based debugging and analysis

    Event-driven simulation engine

    Coding C/C++ & auxiliary functions

    Open interface for integrating external object files, libraries, and other simulators

    Asynchronous/synchronous modeling

    OPNET in CMP Research

    "QNoC: QoS architecture and design process for Network on Chip", E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, Special issue on Networks on Chip, The Journal of Systems Architecture, December 2003

    "Network Delays and Link Capacities in Application-Specific Wormhole NoCs", Z. Guz, I. Walter, E. Bolotin, I. Cidon, R. Ginosar, and A. Kolodny, VLSI Design, vol. 2007, Article ID 90941, May 2007

    "Routing Table Minimization for Irregular Mesh NoCs", E. Bolotin, I. Cidon, R. Ginosar, A. Kolodny, DATE 2007

    "Access Regulation to Hot-Modules in Wormhole NoCs", I. Walter, I. Cidon, R. Ginosar, A. Kolodny, NOCS 2007

    "The Power of Priority: NoC based Distributed Cache Coherency", E. Bolotin, Z. Guz, I. Cidon, R. Ginosar, A. Kolodny, NOCS 2007

    "Best of Both Worlds: A Bus Enhanced NoC (BENoC)", R. Manevich, I. Walter, I. Cidon, and A. Kolodny, the ACM/IEEE Int. Symp. on Networks-on-Chip (NOCS), 2009

    Bus-Enhanced Network on-Chip

    A new interconnect architecture, utilizing the best of both worlds

    Use NoC for data delivery

    Use bus for lightweight, latency critical meta-data

    Coherency

    [Figure: grid of modules, each attached to a router (R)]

    Gluing OPNET to Simics

    Run OPNET as a trace-driven simulator

    L2 access logs generated by Simics

    Advantages

    Fast

    Simple

    Disadvantage

    Dependencies are lost

    Does not account for latency hiding techniques (e.g. OOO)

    But..

    OPNET can be glued to Simics using Ruby
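Why dependencies get lost in trace-driven mode can be shown with a small comparison. In an open-loop replay the issue times come straight from the log, so a slower network never delays later requests; in a closed-loop run a blocking core issues its next access only after the previous reply. The sketch below assumes a single blocking core and a fixed network latency; the function names and all numbers are illustrative and not taken from the slides.

```python
def open_loop_finish(issue_cycles, latency):
    """Trace-driven replay: issue times are fixed by the trace."""
    return max(t + latency for t in issue_cycles)


def closed_loop_finish(issue_cycles, recorded_latency, latency):
    """Blocking core: each access waits for the previous reply before issuing."""
    now = issue_cycles[0]
    for prev, nxt in zip(issue_cycles, issue_cycles[1:]):
        # Work the core did between two accesses when the trace was recorded.
        compute = nxt - (prev + recorded_latency)
        now = now + latency + compute
    return now + latency


if __name__ == "__main__":
    trace = [0, 30, 60, 90]   # issue cycles logged with a 10-cycle network
    print("open loop, 25-cycle network:  ", open_loop_finish(trace, 25))
    print("closed loop, 25-cycle network:", closed_loop_finish(trace, 10, 25))
```

When the replayed latency equals the recorded one, both functions agree; as soon as the network gets slower, the open-loop replay underestimates the finish time because the trace cannot push later requests back.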

    Meet the Contenders

    CPU2006, CPU2000

    OMP2001

    JBB2005, JBB2000

    SPLASH-2

    PARSEC

    Commercial workloads

    Apache

    Databases

    ?

    ?

    Benchmark Comparison

                 CPU2006   OMP2001   SPLASH-2   PARSEC   Commercial
    Programs     29        11        14         13       1

    Other rows compared (per-suite marks lost in extraction): Multi-Threaded, Diverse, Updated, Emerging apps, Installation ease, Simulation friendly

    The PARSEC Benchmark Suite

    Over 1000 downloads since release

    This is what everyone will be using

    http://parsec.cs.princeton.edu/

    Technion Goodies

    http://www.ee.technion.ac.il/matrics/software.html

    Simics workload kits

    Ease installation of Simics workloads

    Wisconsin GEMS provides a few others too

    Constantly adding more workloads to the pool

    Can you help?

    OPNET models for NoC

    Our entire QNoC model for OPNET

    Cores, router and links, SNUCA/DNUCA L2 caches

    Routing schemes, arbitration policies, resource contention

    Synthetic/trace driven simulation

    Transactified version of Apache

    Summary

    A swift overview of simulation tools for CMP

    Simics

    GEMS

    OPNET

    Technion's two cents

    Questions?

    [email protected]