Post on 12-Mar-2020


  • Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |

    Exadata Performance Troubleshooting Methodology

    James Viscusi Consulting Member of Technical Staff

    Andrew Bulloch Architect

    Server Technologies - Maximum Availability Architecture Team


    Safe Harbor Statement

    THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE.


    The questions

    How do I monitor my Exadata environment? What parameters are most important?

    What thresholds do I set to monitor my Exadata using Enterprise Manager?

    How do I diagnose a performance problem involving Exadata?


    Agenda

    Level Setting – Exadata

    What to do before problems occur?

    What do we do when problems occur?


    Exadata Architecture

    1


    What is Exadata?

    • First and foremost, Exadata is a platform to run Oracle databases in a highly available and performant manner.

    • The hardware and software stack are tightly integrated. The components are tested by Oracle and work together, making the solution extremely performant.

    • Each generation of Exadata is designated X2, X3, X4, etc. The number increases by one with each hardware release. Available on Intel or SPARC chipsets.

    • The second part of the name is either -2 or -8, indicating the number of CPU sockets on each compute node.


    What makes up an Exadata Database Machine?

    • Storage/Cell Servers

    • Compute/Database Servers

    • InfiniBand Switches

    • Ethernet Switch


    What do we do before problems occur?

    • Configure Enterprise Manager metric extensions

    • Understand Key Performance Indicators (KPIs) for Exadata

    • View KPIs using Systems and Services in Enterprise Manager

    • Configure Adaptive Thresholds in EM 13 for Exadata KPIs


    What are Metrics in Enterprise Manager?

    • A metric is a stored piece of information used to monitor a target

    • Types of metrics
      – Information collected by the EM Agent
      – Information derived from data stored in the repository

    • Metric Extensions
      – Metrics that are custom defined by users

    • Can be server side or repository side


    Metrics and Thresholds

    • Enterprise Manager has a comprehensive set of metrics that allow thresholds to be defined on all target types.
      – Thresholds allow for alerting if a chosen metric crosses a certain value

    • Server (Compute Node) Metrics
      – Monitored as any other host target (memory, I/O, CPU, network)

    • Cell Server Metrics
      – Incidents are created for all alerts received from the cell (SNMP based)

    • Database Metrics
      – Database Time Spent Waiting, Throughput, Efficiency

    • One problem: Enterprise Manager monitors many metrics!
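The warning/critical threshold idea on this slide can be illustrated with a minimal Python sketch. It is not Enterprise Manager code: the function name `severity`, the host names, and the threshold values (85 and 95 percent) are all assumptions chosen for illustration, and it assumes a metric where higher values are worse.

```python
from typing import Optional


def severity(value: float, warning: float, critical: float) -> Optional[str]:
    """Classify one metric sample against two thresholds (higher is worse)."""
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return None  # below both thresholds: no alert


# Hypothetical CPU utilization samples (percent) for three compute nodes.
cpu_samples = {"dbnode01": 62.0, "dbnode02": 87.5, "dbnode03": 97.1}

# Illustrative threshold values only; suitable values differ per environment.
alerts = {
    host: severity(cpu, warning=85.0, critical=95.0)
    for host, cpu in cpu_samples.items()
}

for host, level in alerts.items():
    if level is not None:
        print(f"{host}: {level}")
```

With many metrics per target, every metric needs its own pair of values, which is why the later slides narrow the focus to a small set of KPIs.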


    Key Performance Indicators

    • What is a KPI?
      – A quantifiable measurement used to determine server health or performance

    • A set of KPIs has been defined for:
      – Compute Nodes
      – Storage Servers
      – InfiniBand Switches

    • KPIs are defined and explained in:
      – http://www.oracle.com/technetwork/database/availability/exadata-storage-server-kpis-2855362.pdf
      – Also reference MOS Note 2094648.1


    Compute and InfiniBand KPIs

    Compute Nodes

    • CPU Utilization
    • Memory Utilization
    • Load Average
    • Swap Utilization

    InfiniBand Switches

    • CPU Usage
    • Memory Percent Used
    • Root Filesystem Usage
    • SSH Session Count


    Storage Server Key Performance Indicators

    • Use Metric Extensions to create compound metrics

    • KPIs for a Storage Server aggregate read and write data
      – Create Metric Extensions (again in MOS 2094648.1)

    • Disk IOPS
    • Disk Throughput
    • Response Time
    • I/O Load
    • Cell Health
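A compound metric of the kind described here combines separate read and write figures into a single per-cell value. The Python sketch below shows that aggregation only. It is not the Metric Extension definition from MOS 2094648.1: the `DiskSample` fields, the `cell_kpis` function, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class DiskSample:
    """One hypothetical per-disk sample from a storage cell."""
    cell: str
    disk: str
    read_iops: float
    write_iops: float
    read_mbps: float
    write_mbps: float


def cell_kpis(samples: Iterable[DiskSample]) -> Dict[str, Dict[str, float]]:
    """Sum read and write figures over all disks of each cell."""
    totals: Dict[str, Dict[str, float]] = {}
    for s in samples:
        kpi = totals.setdefault(
            s.cell, {"disk_iops": 0.0, "disk_throughput_mbps": 0.0}
        )
        kpi["disk_iops"] += s.read_iops + s.write_iops
        kpi["disk_throughput_mbps"] += s.read_mbps + s.write_mbps
    return totals


# Illustrative values only.
samples = [
    DiskSample("cel01", "disk0", 100.0, 50.0, 10.0, 5.0),
    DiskSample("cel01", "disk1", 120.0, 30.0, 12.0, 3.0),
    DiskSample("cel02", "disk0", 80.0, 20.0, 8.0, 2.0),
]
print(cell_kpis(samples))
```

Reducing many per-disk read and write metrics to one number per cell keeps the set of values that need thresholds small.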


    Enterprise Manager Services

    • Metric Extensions with Services allow a holistic view of the storage grid
      – Incidents will be created whenever warning or critical thresholds are crossed


    However… (another often asked question)

    • Using thresholds in Enterprise Manager allows users to be alerted when metrics indicate an issue
      – e.g. CPU usage exceeds a specified amount

    • KPIs do not have universal values. They can differ depending on many things
      – Customer Requirements
      – Environment Usage

    • Defining one set of thresholds that works for every customer/environment is not feasible


    Adaptive Thresholds (new in EM 13)

    • Use the collected metrics to make a data-driven recommendation for each specific system
      – Analyze the data over a 1-4 week window

    • Not all metrics are eligible (but the ones we need are!)

    • Two methods of collecting the data, as described in the paper
      – Dynamic
      – Guided

    • Companion paper to the KPI paper
      – http://www.oracle.com/technetwork/database/availability/exadata-adaptive-thresholds-3102556.pdf
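To make "data-driven recommendation" concrete, the Python sketch below derives warning and critical values from a window of historical samples using nearest-rank percentiles. This illustrates the general idea only. The slides do not describe the algorithm Enterprise Manager uses, and the percentile choices (95 and 99), the function names, and the sample history are assumptions.

```python
import math
from typing import Sequence, Tuple


def percentile(values: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples collected")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def suggest_thresholds(
    history: Sequence[float], warning_pct: int = 95, critical_pct: int = 99
) -> Tuple[float, float]:
    """Suggest (warning, critical) values from the system's own history."""
    return percentile(history, warning_pct), percentile(history, critical_pct)


# Hypothetical history: 100 samples with values 1..100 (e.g. CPU percent).
history = [float(v) for v in range(1, 101)]
warning, critical = suggest_thresholds(history)
print(f"warning={warning}, critical={critical}")
```

Because the suggestion comes from each system's own samples, two environments with different normal workloads end up with different thresholds, which addresses the "no universal values" point made earlier.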


    Customizing Adaptive Threshold collections


    Adaptive Threshold Data analytics


    Adaptive Threshold Final Setting


    AWR Baselines

    • Collection of snapshots used for performance comparisons.

    • Baselines are retained within the AWR even after the retention time for the data has been reached.

    • Exadata should have a moving and a static baseline in place to capture different workloads.


    What to do when problems occur?

    • Review

    • Rule Out Hardware

    • Compare

    • Drill down

    2


    A checklist can be very useful!


    Rule Out Hardware


    Check the Obvious

    DB Machine Home Page

    • Contains a lot of good information at a quick glance

    • Incident Manager

    • Alert Logs

    • Grid Infrastructure

    • ASM

    • Databases


    Incident Manager


    Incident Manager (Contd.)


    Compare


    What Changed?

    If there is a problem, what has changed? And who might know?

    Considerations

    • Patch levels (everywhere!)

    • Schema

    • Tunable OS parameters

    • Resource Management Plans

    • Code Changes

    • ADDM Comparison Report


    Compare Configurations – Exadata Level


    Compare Configurations – Database Level

    • EM Job to compare one ‘reference’ database against one or more other databases

    • Job can be scheduled on a repetitive basis, or run ad-hoc
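The comparison of one reference database against others can be pictured as a diff of settings. The Python sketch below is not the Enterprise Manager job. It assumes each database's configuration has already been exported into a plain dictionary, and the function name `compare_to_reference`, the database names, and the parameter values are hypothetical.

```python
from typing import Dict, Optional, Tuple

Config = Dict[str, str]
Diff = Dict[str, Tuple[Optional[str], Optional[str]]]


def compare_to_reference(reference: Config, others: Dict[str, Config]) -> Dict[str, Diff]:
    """For each database, list settings that differ from the reference.

    Each difference maps a setting name to (reference value, other value);
    None means the setting is absent on that side.
    """
    report: Dict[str, Diff] = {}
    for name, cfg in others.items():
        diffs: Diff = {}
        for key in sorted(set(reference) | set(cfg)):
            ref_val, val = reference.get(key), cfg.get(key)
            if ref_val != val:
                diffs[key] = (ref_val, val)
        report[name] = diffs
    return report


# Illustrative settings only.
reference = {"processes": "1024", "sga_target": "32G", "cell_offload_processing": "TRUE"}
others = {
    "proddb2": {"processes": "1024", "sga_target": "32G", "cell_offload_processing": "TRUE"},
    "proddb3": {"processes": "512", "sga_target": "32G"},
}

for db, diffs in compare_to_reference(reference, others).items():
    print(db, diffs if diffs else "matches reference")
```

Running such a comparison on a schedule, as the slide suggests, gives a starting answer to "what changed?" when a problem appears.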
