02 monitoring

Upload: lino-alleje-cruz-iii

Post on 06-Apr-2018


  • 8/3/2019 02 Monitoring

    1/20

    POSTECH DP&NM Lab

    1

    Network Monitoring

    J. Won-Ki Hong

    Dept. of Computer Science and Engineering

    POSTECH

    Tel: 054-279-2244

    Email: [email protected]

    Table of Contents

    Introduction

    Monitored Types of Information

    Network Monitoring Configurations

    Network Monitoring Methods

    Performance Monitoring

    Performance Indicators

    Performance Monitoring Functions

    Fault Monitoring

    Problems of Fault Monitoring

    Fault Monitoring Functions

    Accounting Monitoring

    Introduction

    Network monitoring is concerned with observing and analyzing the status and behavior of the end systems, intermediate systems, and subnetworks that make up the network to be managed

    Issues in network monitoring

        What to monitor?

            define what is to be monitored

        How to monitor?

            how to obtain information from managed resources

        What to do with the monitored information?

            how the monitored information is used in the various management functional areas

    Monitored Types of Information

    Static information

    rarely changes

    current configuration information

    e.g., the number and identification of ports on a router

    Dynamic information

    changes frequently

    information related to events in the network

    e.g., change of state, transmission/reception of packets

    Statistical information

    derived from dynamic information

    e.g., average number of packets transmitted per unit time
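The three kinds of information above can be illustrated with a minimal Python sketch. The router name, port names, counter values, and the `PacketSample` and `average_packet_rate` names are all hypothetical and chosen only for illustration. Statistical information is derived here as the average packet rate between the first and last readings of a cumulative counter.

```python
from dataclasses import dataclass


@dataclass
class PacketSample:
    """One dynamic observation: a cumulative packet counter read at a given time."""
    timestamp_s: float
    packets_sent: int


def average_packet_rate(samples):
    """Derive a statistic (packets per second) from dynamic counter samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a rate")
    first, last = samples[0], samples[-1]
    elapsed = last.timestamp_s - first.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return (last.packets_sent - first.packets_sent) / elapsed


# Static information: configuration that rarely changes (hypothetical router).
router_config = {"name": "router-1", "ports": ["eth0", "eth1", "eth2"]}

# Dynamic information: counter readings that change as packets are transmitted.
samples = [
    PacketSample(0.0, 1000),
    PacketSample(10.0, 1500),
    PacketSample(20.0, 2200),
]

# Statistical information: derived from the dynamic samples.
rate = average_packet_rate(samples)
```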

    Organization of a Management Information Base

    [Figure: the MANAGEMENT INFORMATION BASE (MIB). Data bases shown: static data base, configuration data base, sensor data base, dynamic data base, statistical data base. Annotations: "Sensor activation and data collection" and "Abstraction of state and event variables". Object labels: Call_Blocked, Packet_Loss, Time_Delay, Throughput, State_Variable, Event_Variable, Switch_server, Buffer, Source, Server, Station_Info, Switch_Buffer, Switch_Source, Status_Sensor, Derived_Status_Sensor, Event_Sensor.]

    Monitoring System Components

    monitoring application

        includes the functions of monitoring that are visible to the user

        e.g., performance, fault, accounting

    manager function

        performs the basic monitoring function of retrieving information

    agent function

        gathers and records management information for one or more network elements and delivers the information to the monitor

    managed objects

        management information that represents resources and their activities

    monitoring agent

        generates summaries and statistical analysis of management information

    Functional Architecture for Network Monitoring

    [Figure: (a) manager-agent model: monitoring application, manager function, agent function, managed objects. (b) A model for summarization: monitoring application, manager function, monitoring agent, and several agent functions, each with its own managed objects.]

    Network Monitoring Configurations

    [Figure: four monitoring configurations: (a) managed resources in manager system; (b) resources in agent system; (c) external monitor; (d) proxy monitor agent. Component labels: monitoring application, manager function, agent function, managed objects. Other labels: subnetwork or internet, LAN, observed traffic.]

    Network Monitoring Methods

    Polling

        a request-response interaction between a manager and an agent

        a manager sends a request to an agent, which processes the request and responds with information from its MIB

        a manager may use polling to

            learn about the configuration it is managing

            periodically obtain an update of conditions

            investigate an area in detail after being alerted to a problem

    Event Reporting

        information flow is initiated by the agent toward the manager

        an agent may generate a report periodically to give the manager its current status, or whenever a significant event (e.g., a change of state) or an unusual event (e.g., a fault) occurs

        good for detecting problems as soon as they occur
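The difference between the two methods can be sketched in a few lines of Python. This is an in-process illustration only, not a real management protocol: the `Agent` and `Manager` classes, the callback mechanism, and the variable names stored in the toy MIB are assumptions made for this example.

```python
class Agent:
    """Hypothetical agent that holds a tiny MIB and can push event reports."""

    def __init__(self, name, mib):
        self.name = name
        self.mib = dict(mib)
        self.listeners = []  # callbacks registered by managers

    def handle_request(self, variable):
        """Polling: answer a manager's request from the MIB."""
        return self.mib.get(variable)

    def set_value(self, variable, value):
        """Update the MIB and report any change to every registered manager."""
        old = self.mib.get(variable)
        self.mib[variable] = value
        if old != value:
            for notify in self.listeners:
                notify(self.name, variable, old, value)


class Manager:
    """Hypothetical manager that polls agents and receives event reports."""

    def __init__(self):
        self.events = []

    def poll(self, agent, variable):
        """Request-response interaction initiated by the manager."""
        return agent.handle_request(variable)

    def on_event(self, agent_name, variable, old, new):
        """Event reporting: interaction initiated by the agent."""
        self.events.append((agent_name, variable, old, new))


manager = Manager()
agent = Agent("router-1", {"ifOperStatus": "up", "ifInOctets": 1200})
agent.listeners.append(manager.on_event)

polled = manager.poll(agent, "ifInOctets")  # manager asks, agent answers
agent.set_value("ifOperStatus", "down")     # agent reports without being asked
```

With polling, the manager learns a value only when it asks. With event reporting, the state change reaches the manager as soon as the agent records it.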

    Performance Monitoring

    Measuring the performance of the network (performance monitoring) is essential in network management (NM)

        to detect and fix problems that cause performance degradation

        to better plan network upgrades

    Problems in selecting and using appropriate indicators (or metrics)

        too many indicators are in use

        the meaning of most indicators is not yet clearly understood

        some indicators are supported by only some manufacturers

        frequently, the indicators are accurately measured but incorrectly interpreted by a human or a management application

        the calculation of indicators takes too much time

    Network Performance Indicators

    Service-oriented

        Availability: the percentage of time that a network system, a component, or an application is available to a user

        Response Time: how long it takes for a response to appear at a user's terminal after a user action calls for it

        Accuracy: the percentage of time that no errors occur in the transmission and delivery of information

    Efficiency-oriented

        Throughput: the rate at which application-oriented events (e.g., file transfers) occur

        Utilization: the percentage of the theoretical capacity of a resource (e.g., transmission line, switch, CPU) that is being used
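As a rough illustration, three of these indicators reduce to simple ratios. The sketch below is one possible reading of the definitions above, not a standard formula set. The function names and sample figures are assumptions, apart from the nominal T1 line rate of 1.544 Mbit/s.

```python
def availability_percent(uptime_s, total_s):
    """Percentage of the observation period during which the resource was available."""
    if total_s <= 0:
        raise ValueError("observation period must be positive")
    return 100.0 * uptime_s / total_s


def utilization_percent(bits_transferred, interval_s, capacity_bps):
    """Percentage of the theoretical capacity that was actually used."""
    if interval_s <= 0 or capacity_bps <= 0:
        raise ValueError("interval and capacity must be positive")
    return 100.0 * bits_transferred / (interval_s * capacity_bps)


def throughput_per_s(events, interval_s):
    """Rate of application-oriented events (e.g., file transfers) per second."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return events / interval_s


# Hypothetical figures: a 30-day period with 2,592 seconds of downtime.
period_s = 30 * 24 * 3600
availability = availability_percent(period_s - 2592, period_s)

# Hypothetical figures: 46,320,000 bits carried in 60 s over a T1 line.
utilization = utilization_percent(46_320_000, 60, 1_544_000)

# Hypothetical figures: 120 file transfers completed in 60 s.
throughput = throughput_per_s(120, 60)
```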

    Elements of Response Time

    [Figure: workstation, network, network interface (e.g., router), and server, annotated with the delay components TI, WI, SI, CPU, WO, SO, and TO]

    RT = TI + WI + SI + CPU + WO + SO + TO

    RT = response time

    TI = inbound terminal delay

    WI = inbound queuing time

    SI = inbound service time

    CPU = CPU processing delay

    WO = outbound queuing time

    SO = outbound service time

    TO = outbound terminal delay
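The response-time formula above is a plain sum of seven delay components, so it is easy to compute once each component has been measured. The following sketch uses made-up millisecond values, and the `ResponseTime` class is a hypothetical helper. It also picks out the largest component, which is often the first place to look when response time is too high.

```python
from dataclasses import dataclass, asdict


@dataclass
class ResponseTime:
    """Delay components of RT = TI + WI + SI + CPU + WO + SO + TO, in milliseconds."""
    ti: int   # inbound terminal delay
    wi: int   # inbound queuing time
    si: int   # inbound service time
    cpu: int  # CPU processing delay
    wo: int   # outbound queuing time
    so: int   # outbound service time
    to: int   # outbound terminal delay

    def total(self):
        """Response time as the sum of all seven components."""
        return sum(asdict(self).values())

    def dominant(self):
        """Name of the component that contributes the most delay."""
        parts = asdict(self)
        return max(parts, key=parts.get)


# Hypothetical measurements in milliseconds.
sample = ResponseTime(ti=5, wi=20, si=10, cpu=40, wo=15, so=10, to=5)
rt_ms = sample.total()
largest = sample.dominant()
```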

    Performance Monitoring Functions

    Performance Measurement

        the actual gathering of statistics about network traffic and timing

        typically performed by agents within network devices

        e.g., amount of data in and out of a node, number of connections, traffic per connection

    Performance Analysis

        analyzing the gathered data and presenting it

        e.g., total, average, min, max, histogram

    Synthetic Traffic Generation

        generating an artificial traffic load

        permits the network to be observed under a controlled load
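The performance analysis step (total, average, min, max, histogram) can be sketched with the Python standard library. The sample response times, the 20 ms bin width, and the `summarize` helper are assumptions made for this illustration.

```python
from collections import Counter
from statistics import mean


def summarize(values, bin_width):
    """Reduce raw measurements to the summary figures named on the slide."""
    if not values:
        raise ValueError("no measurements to summarize")
    # Each value is counted in the bin that starts at the nearest lower multiple of bin_width.
    histogram = Counter((v // bin_width) * bin_width for v in values)
    return {
        "total": sum(values),
        "average": mean(values),
        "min": min(values),
        "max": max(values),
        "histogram": dict(sorted(histogram.items())),
    }


# Hypothetical response-time measurements in milliseconds.
response_times_ms = [12, 18, 25, 31, 33, 47, 90]
summary = summarize(response_times_ms, bin_width=20)
```

The histogram keys are the lower edges of the bins, so a key of 20 counts measurements from 20 up to but not including 40.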

    Typical Performance-Related Questions

    Performance measurements can be used to answer a number of questions

        Why is the response so slow? (a very loaded question!)

        Why is the retransmission rate so high?

        Is traffic evenly distributed among network users, or are there source-destination pairs with unusually heavy traffic?

        What is the percentage of each type of packet?

        What is the channel utilization and throughput?

        What is the effect of traffic load on utilization, throughput, and time delays?

        When does traffic load start to degrade system performance?

        What is the maximum capacity of the channel under normal operating conditions? How many active users are necessary to reach this maximum?

    Fault Monitoring

    To detect faults as quickly as possible after they occur and to identify the cause of each fault so that corrective action may be taken

    Problems of Fault Monitoring

        Fault Detection Problems

            Unobservable faults: e.g., deadlock, device not monitorable

            Partially observable faults: observations are insufficient to pinpoint the problem

            Uncertainty in observation: not clear what the problem is

        Fault Isolation Problems

            Multiple potential causes

            Too many related observations

            Interference between diagnosis and local recovery procedures

            Absence of automated testing tools

    What happens when the T1 link fails?

    [Figure: heterogeneous network environment with a client and a server, 802.5 and 802.3 LANs, two routers, two MUXes, two PBXes, and a T1 link]

    Propagation of Failures to Higher Layers

    [Figure: client and server connected through two routers and two muxes. Failure labels: transmission break, data link failure, transport failure, application failure.]
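Because one low-level break can surface as failures at every higher layer, a monitor may receive several alarms for a single fault. One simple heuristic, shown here only as an assumption and not as the method of the slides, is to treat the lowest-layer alarm as the probable root cause and the rest as symptoms. The `Alarm` class, the layer ordering, and the alarm sources are hypothetical.

```python
from dataclasses import dataclass

# Layers ordered from lowest to highest, following the labels in the figure.
LAYERS = ["transmission", "data link", "transport", "application"]


@dataclass(frozen=True)
class Alarm:
    layer: str
    source: str
    message: str


def split_root_cause(alarms):
    """Return (probable_root_cause, symptoms) using a lowest-layer-first heuristic."""
    if not alarms:
        raise ValueError("no alarms to analyze")
    ordered = sorted(alarms, key=lambda alarm: LAYERS.index(alarm.layer))
    return ordered[0], ordered[1:]


# Hypothetical alarms raised after a single transmission break.
alarms = [
    Alarm("application", "client", "application failure"),
    Alarm("transport", "client", "transport failure"),
    Alarm("data link", "router", "data link failure"),
    Alarm("transmission", "mux", "transmission break"),
]
root, symptoms = split_root_cause(alarms)
```

A real fault isolation system would also need topology and timing information, since independent faults at different layers can occur at the same time.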

    Fault Monitoring Functions

    Logging

        record important events and errors

        logs should be accessible by managers (e.g., via polling)

    Event Reporting

        sending events and errors to managers

        sending alarms to managers to warn of possible problems

    Diagnostic Functions

        connectivity test (e.g., traceroute)

        response-time test

        liveness test (e.g., ping)

        protocol integrity test

        loopback test
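A liveness test and a response-time test can be combined in one small probe. The sketch below uses a TCP connection attempt instead of an ICMP ping, because ICMP usually needs raw-socket privileges. That substitution, the `tcp_liveness_test` name, and the one-second default timeout are assumptions made for this example.

```python
import socket
import time


def tcp_liveness_test(host, port, timeout_s=1.0):
    """Try to open a TCP connection to host:port.

    Returns (alive, response_time_s). response_time_s is the time taken to
    establish the connection, or None if the attempt failed or timed out.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, None
```

A manager could call this periodically for each monitored service and raise an alarm when `alive` is false or when the response time exceeds a chosen threshold.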

    Accounting Monitoring

    Keeping track of users' usage of network resources

    communication facilities

    computer hardware

    software and systems

    services

    Usage may need to be broken down by account, by

    project, or by individual user for appropriate accounting

    purposes
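Breaking usage down by account, project, or individual user amounts to grouping the same usage records by different keys. The sketch below assumes a hypothetical record layout (`user`, `project`, `account`, `bytes`) and made-up values.

```python
from collections import defaultdict


def usage_by(records, key):
    """Total the bytes used, grouped by the given record field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["bytes"]
    return dict(totals)


# Hypothetical usage records collected by an accounting monitor.
records = [
    {"user": "alice", "project": "billing", "account": "finance", "bytes": 500},
    {"user": "bob", "project": "billing", "account": "finance", "bytes": 300},
    {"user": "alice", "project": "website", "account": "marketing", "bytes": 200},
]

by_user = usage_by(records, "user")
by_project = usage_by(records, "project")
by_account = usage_by(records, "account")
```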

    Summary

    Network monitoring is the most basic aspect of network management (NM)

    The purpose of network monitoring is to gather information about the status and behavior of network elements

    Information to be gathered includes static, dynamic, and statistical information

    Monitoring methods: polling and event reporting

    Monitoring functions

        performance monitoring

        fault monitoring

        accounting monitoring

    READ Chapter 2 of Textbook