02 monitoring
-
8/3/2019 02 Monitoring
1/20
POSTECH DP&NM Lab
1
Network Monitoring
J. Won-Ki Hong
Dept. of Computer Science and Engineering
POSTECH
Tel: 054-279-2244
Email: [email protected]
-
Table of Contents
Introduction
Monitored Types of Information
Network Monitoring Configurations
Network Monitoring Methods
Performance Monitoring
Performance Indicators
Performance Monitoring Functions
Fault Monitoring
Problems of Fault Monitoring
Fault Monitoring Functions
Accounting Monitoring
-
Introduction
Network monitoring is concerned with observing and analyzing
the status and behavior of the end systems,
intermediate systems, and subnetworks that make up the
network to be managed
Issues in network monitoring
what to monitor?
define what is to be monitored
how to monitor?
how to obtain information from managed resources
what to do with the monitored information?
how the monitored information is used in various management
functional areas
-
Monitored Types of Information
Static information
hardly changes
current configuration information
e.g., the number and identification of ports on a router
Dynamic information
changes frequently
information related to events in the network
e.g., change of state, transmission/reception of packets
Statistical information
derived from dynamic information
e.g., average number of packets transmitted per unit time
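As a minimal sketch of how statistical information can be derived from dynamic information, the Python snippet below computes the average number of packets transmitted per second from two readings of a cumulative packet counter. The function name and the sample readings are hypothetical, and counter wraparound is ignored.

```python
def average_rate(count_start, count_end, t_start, t_end):
    """Average events per unit time between two counter readings."""
    elapsed = t_end - t_start
    if elapsed <= 0:
        raise ValueError("t_end must be later than t_start")
    return (count_end - count_start) / elapsed


# Hypothetical readings of a cumulative "packets transmitted" counter,
# taken 60 seconds apart (dynamic information).
rate = average_rate(count_start=12_000, count_end=15_000, t_start=0.0, t_end=60.0)
print(rate)  # 50.0 packets per second (statistical information)
```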
-
Organization of a Management Information Base
[Figure: organization of a Management Information Base (MIB).
Data base labels: static data base, configuration data base, sensor data base, dynamic data base, statistical data base.
Process labels: sensor activation and data collection; abstraction of state and event variables.
Variable labels: Call_Blocked, Packet_Loss, Time_Delay, Throughput, State_Variable, Event_Variable, Switch_server, Buffer, Source, Server, Station_Info, Switch_Buffer, Switch_Source, Status_Sensor, Derived_Status_Sensor, Event_Sensor.]
-
Monitoring System Components
monitoring application
includes the functions of monitoring that are visible to the user
e.g., performance, fault, accounting
manager function
performs the basic monitoring function of retrieving information
agent function
gathers and records management information for one or more
network elements and delivers the information to the monitor
managed objects
mgmt information that represents resources and their activities
monitoring agent
generates summaries and statistical analysis of mgmt information
-
Functional Architecture for Network Monitoring
[Figure: two panels built from the components monitoring application, manager function, monitoring agent, agent function, and managed objects.
(a) Manager-agent model
(b) A model for summarization]
-
Network Monitoring Configurations
[Figure: four monitoring configurations, each drawn with a monitoring application, a manager function, and an agent function.
(a) Managed resources in manager system
(b) Resources in agent system
(c) External monitor
(d) Proxy monitor agent
Other labels in the figure: managed objects, observed traffic, LAN, subnetwork or internet.]
-
Network Monitoring Methods
Polling
a request-response interaction between a manager and an agent
a manager sends a request to an agent, which processes the
request and responds with information from its MIB
a manager may use polling to
learn about the configuration it is managing
obtain periodically an update of conditions
investigate an area in detail after being alerted to a problem
Event Reporting
information flow is initiated from the agent to the manager
an agent may generate a report periodically to give the manager its
current status, or whenever a significant event (e.g., change of
state) or an unusual event (e.g., fault) occurs
good for detecting problems as soon as they occur
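The two methods above can be contrasted with a small in-process sketch in Python. It is not a real management protocol: the class names, the MIB variable names (link_state, packets_in), and the agent name router-1 are hypothetical, and the "network" is just a method call.

```python
class Agent:
    """Holds management information (a toy MIB) for one network element."""

    def __init__(self, name):
        self.name = name
        self.mib = {"link_state": "up", "packets_in": 0}
        self.listeners = []  # managers that receive event reports

    def handle_request(self, variable):
        # Polling: answer a manager's request with a value from the MIB.
        return self.mib[variable]

    def set_value(self, variable, value):
        # Event reporting: a change of state is pushed to every listener.
        changed = self.mib.get(variable) != value
        self.mib[variable] = value
        if changed:
            for listener in self.listeners:
                listener.on_event(self.name, variable, value)


class Manager:
    def __init__(self):
        self.events = []  # event reports received so far

    def poll(self, agent, variable):
        # Manager-initiated request-response interaction.
        return agent.handle_request(variable)

    def on_event(self, agent_name, variable, value):
        # Agent-initiated report.
        self.events.append((agent_name, variable, value))


manager = Manager()
agent = Agent("router-1")
agent.listeners.append(manager)

print(manager.poll(agent, "link_state"))  # up
agent.set_value("link_state", "down")     # triggers an event report
print(manager.events)  # [('router-1', 'link_state', 'down')]
```

With polling, the manager learns of the failure only at its next request; with event reporting, the agent tells it as soon as the state changes.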
-
Performance Monitoring
Measuring the performance of the network (or performance monitoring) is absolutely required in NM
to detect & fix problems that cause performance degradation
to better plan network upgrades
Problems in selecting and using appropriate indicators (or
metrics)
too many indicators in use
the meaning of most indicators is not yet clearly understood
some indicators are supported by some manufacturers only
frequently, the indicators are accurately measured but incorrectly
interpreted by humans or mgmt applications
the calculation of indicators takes too much time
-
Network Performance Indicators
Service-oriented
Availability: the percentage of time that a network system, a
component, or an application is available for a user
Response Time: how long it takes for a response to appear at
a user's terminal after a user action calls for it
Accuracy: the percentage of time that no errors occur in the
transmission and delivery of information
Efficiency-oriented
Throughput: the rate at which application-oriented events (e.g.,
file transfers) occur
Utilization: the percentage of the theoretical capacity of a
resource (e.g., transmission line, switch, CPU) that is being used
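Three of the indicators defined above reduce to simple ratios. The Python sketch below is one possible reading of those definitions; the function names, argument names, and sample figures are hypothetical, and real measurements would need agreed units and measurement intervals.

```python
def availability(uptime, total_time):
    """Percentage of time the resource was available to users."""
    return 100.0 * uptime / total_time


def utilization(used, capacity):
    """Percentage of the theoretical capacity that is being used."""
    return 100.0 * used / capacity


def throughput(events, elapsed):
    """Rate of application-oriented events per unit time."""
    return events / elapsed


# Hypothetical figures: 99 of 100 hours up, 4.5 of 10 Mbit/s in use,
# 120 file transfers completed in 60 minutes.
print(availability(uptime=99, total_time=100))  # 99.0 (%)
print(utilization(used=4.5, capacity=10.0))     # 45.0 (%)
print(throughput(events=120, elapsed=60))       # 2.0 (transfers per minute)
```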
-
Elements of Response Time
[Figure: workstation, network, network interface (e.g., router), and server, with the delay components TI, WI, SI, CPU, WO, SO, and TO marked.]
RT = TI + WI + SI + CPU + WO + SO + TO
RT = response time
TI = inbound terminal delay
WI = inbound queuing time
SI = inbound service time
CPU = CPU process delay
WO = outbound queuing time
SO = outbound service time
TO = outbound terminal delay
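A short worked example of the response-time sum, as a Python sketch. The millisecond values are invented for illustration only; the point is that RT is the plain sum of the seven components, so the largest component shows where tuning would pay off most.

```python
# Hypothetical delay components in milliseconds (not measured values).
components_ms = {
    "TI": 5,    # inbound terminal delay
    "WI": 20,   # inbound queuing time
    "SI": 10,   # inbound service time
    "CPU": 40,  # CPU process delay
    "WO": 15,   # outbound queuing time
    "SO": 10,   # outbound service time
    "TO": 5,    # outbound terminal delay
}

# RT = TI + WI + SI + CPU + WO + SO + TO
response_time_ms = sum(components_ms.values())
largest = max(components_ms, key=components_ms.get)

print(response_time_ms)  # 105
print(largest)           # CPU
```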
-
Performance Monitoring Functions
Performance Measurement
the actual gathering of statistics about network traffic & timing
typically performed by agents within network devices
e.g., amount of data in and out of a node, number of connections,
traffic per connection
Performance Analysis
analyzing the gathered data and presenting it
e.g., total, average, min, max, histogram
Synthetic Traffic Generation
generating artificial traffic load
permits the network to be observed under a controlled load
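The performance analysis step (total, average, min, max, histogram) can be sketched in a few lines of Python. The function name, the sample values (taken here to be response times in milliseconds), and the bin width are hypothetical.

```python
from collections import Counter
from statistics import mean


def summarize(samples, bin_width):
    """Summarize gathered measurements for presentation."""
    # Each sample is counted under the lower edge of its histogram bin.
    histogram = Counter((s // bin_width) * bin_width for s in samples)
    return {
        "total": sum(samples),
        "average": mean(samples),
        "min": min(samples),
        "max": max(samples),
        "histogram": dict(sorted(histogram.items())),
    }


# Hypothetical response-time samples in milliseconds.
samples = [120, 80, 95, 300, 110, 105]
summary = summarize(samples, bin_width=100)
print(summary["total"], summary["min"], summary["max"])  # 810 80 300
print(summary["histogram"])  # {0: 2, 100: 3, 300: 1}
```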
-
Typical Performance-Related Questions
Performance measurements can be used to answer a
number of questions
Why is the response so slow? (a very loaded question!)
Why is the retransmission rate so high?
Is traffic evenly distributed among network users or are there
source-destination pairs with unusually heavy traffic?
What is the percentage of each type of packet?
What is the channel utilization and throughput?
What is the effect of traffic load on utilization, throughput &
time delays?
When does traffic load start to degrade system performance?
What is the maximum capacity of the channel under normal
operating conditions? How many active users are necessary
to reach this maximum?
-
Fault Monitoring
To detect faults as quickly as possible after they occur
and to identify the cause of the fault so that corrective
action may be taken
Problems of Fault Monitoring
Fault Detection Problems
Unobservable faults: e.g., deadlock, device not monitorable
Partially observable faults: insufficient to pinpoint the problem
Uncertainty in observation: not clear what the problem is
Fault Isolation Problems
Multiple potential causes
Too many related observations
Interference between diagnosis and local recovery procedures
Absence of automated testing tools
-
What happens when the T1 link fails?
[Figure: heterogeneous network environment containing a client, a server, 802.5 and 802.3 LANs, routers, MUXes, PBXes, and a T1 link.]
-
Propagation of Failures to Higher Layers
[Figure: a client and a server connected through routers and muxes. Labels in the figure: transmission break, data link failure, transport failure, application failure.]
-
Fault Monitoring Functions
Logging
record important events and errors
logs should be accessible by managers (e.g., via polling)
Event Reporting
sending events, errors to managers
sending alarms to the manager to warn of possible problems
Diagnostic Functions
connectivity test (e.g., traceroute)
response-time test
liveness test (e.g., ping)
protocol integrity test
loopback test
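A liveness test and a rough response-time test can be combined in one probe. The Python sketch below swaps in a TCP connection attempt for ICMP ping, since ping normally needs raw-socket privileges; it therefore checks that a specific TCP port answers, not that the host as a whole is up. The function name and the default timeout are hypothetical choices.

```python
import socket
import time


def tcp_liveness(host, port, timeout=1.0):
    """Return (alive, elapsed_seconds) for one TCP connection attempt."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or
        # unreachable host; success means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            alive = True
    except OSError:
        alive = False
    return alive, time.monotonic() - start
```

A call such as tcp_liveness("192.0.2.10", 22) (a documentation address, used here only as a placeholder) returns False plus the time spent waiting when nothing answers. The elapsed time on success covers connection setup only, so it is a lower bound on application response time.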
-
Accounting Monitoring
Keeping track of users' usage of network resources
communication facilities
computer hardware
software and systems
services
Usage may need to be broken down by account, by
project, or by individual user for appropriate accounting
purposes
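Breaking usage down by account, project, or individual user amounts to grouping the same usage records by different keys. A minimal Python sketch, assuming each record carries all three keys and a byte count; the record layout, names, and figures are hypothetical.

```python
from collections import defaultdict


def usage_by(records, key):
    """Total bytes used, grouped by the given record field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["bytes"]
    return dict(totals)


# Hypothetical usage records collected by accounting monitoring.
records = [
    {"user": "alice", "project": "p1", "account": "research", "bytes": 500},
    {"user": "bob", "project": "p1", "account": "research", "bytes": 300},
    {"user": "alice", "project": "p2", "account": "teaching", "bytes": 200},
]

print(usage_by(records, "user"))     # {'alice': 700, 'bob': 300}
print(usage_by(records, "project"))  # {'p1': 800, 'p2': 200}
print(usage_by(records, "account"))  # {'research': 800, 'teaching': 200}
```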
-
Summary
Network monitoring is the most basic aspect of NM
The purpose of network monitoring is to gather
information about the status and behavior of network
elements
Information to be gathered includes static, dynamic and statistical information
Monitoring methods - polling & event reporting
Monitoring functions
performance monitoring
fault monitoring
accounting monitoring
READ Chapter 2 of Textbook