QoS Evaluation Model for a Campus-Wide Network: an alternative approach
Juan Antonio Martínez ([email protected])
Comunicacions - Servei d’Informàtica
Universitat Autònoma de Barcelona
Index
Objectives
Classical approach to QoS
Evaluating network availability: a generic model
Extending the model to the services
Practical results
Comparison to other existing products and future work
Conclusions
Objectives
Determine whether our network reaches the ‘5 9s’ objective (99.999% availability)
Obtain a generic model to evaluate the quality of the network regarding
  Infrastructure
  Services
The model must
  Be simple
  Adapt easily to network topology changes
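To make the ‘5 9s’ target concrete, the arithmetic below converts 99.999% availability into the downtime it allows. A 365-day year and a 30-day month are assumed for illustration.

```latex
\[
(1 - 0.99999) \times 365 \times 24 \times 60\ \text{min} \approx 5.26\ \text{min per year}
\]
\[
(1 - 0.99999) \times 30 \times 24 \times 60\ \text{min} \approx 0.43\ \text{min} \approx 26\ \text{s per month}
\]
```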
Obtaining the model
The classical approach
Quality is evaluated as a combination of:
  Delay
  Jitter
  Packet loss
This approach is useful for WAN links and environments
  Easy to measure at both router endpoints
In our opinion, it is not suitable for LAN environments
  Parameters are difficult to determine (distributed environment)
  Gathered data is not significant (bursty traffic)
Problems of this model
Quality is not assigned a numeric value
In a LAN environment
  Network probes must be distributed throughout the network
  In switched environments, each segment provides a different value
  Expensive to implement
  Values fluctuate because of the nature of the traffic
In fact, there is no model; it is just an evaluation of the parameters listed above
Our model: the basis
Whatever the system is, it must:
  Be low cost.
  Provide QoS as a numerical value.
  Be flexible enough to adapt to network topology changes.
Our idea is to get data from:
  Simple tools like ‘ping’
  SNMP queries
We discard complex solutions (modified TCP stacks, proprietary PING...)
Conceptual development of the model
Assume that all network devices are known and support SNMP management.
Assume that the number of users affected by any network device failure can be determined.
The idea behind the process is
  Choose the critical network devices
  Weight them accordingly
  Determine the instantaneous availability
  Compute the mean value
Choosing the devices and the weights for the model
[Figure: CAMPUSUAB network map, dated 15-05-2000, showing the core devices and the building switches with their link speeds]
Mathematical analysis (I)
Let
  Kd be the relative availability coefficient
  di be a binary value that tells whether a segment is accessible or not
For the mean value this implies
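The slide's formulas are not reproduced in this transcript. One reading consistent with the definitions above, and with the sample config file, whose coefficients sum to 1, is sketched here. The symbols D(t), \bar{D}, N and T are assumptions introduced for illustration, not the author's notation.

```latex
% Instantaneous availability: weighted sum over the N monitored devices
\[
D(t) = \sum_{i=1}^{N} K_{d,i}\, d_i(t), \qquad d_i(t) \in \{0, 1\}, \qquad \sum_{i=1}^{N} K_{d,i} = 1
\]
% Mean availability over an observation period T
\[
\bar{D} = \frac{1}{T} \int_{0}^{T} D(t)\, dt
\]
```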
Mathematical analysis (and II)
If we sample at constant intervals
For efficient computing, this means
In this way, we can evaluate the availability from the number of samples, the previous mean value and the last sample.
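An update that needs only the number of samples, the previous mean and the last sample is the standard incremental mean, mean_n = mean_(n-1) + (sample_n - mean_(n-1)) / n. A minimal Python sketch follows. The original tools were C programs, and the names `update_mean` and `running_availability` are illustrative rather than taken from the paper.

```python
def update_mean(prev_mean: float, n: int, sample: float) -> float:
    """Return the mean of n samples, given the mean of the first n-1 and the n-th sample."""
    if n < 1:
        raise ValueError("n counts samples and must be at least 1")
    return prev_mean + (sample - prev_mean) / n


def running_availability(samples) -> float:
    """Fold a sequence of instantaneous availability samples into their mean."""
    mean = 0.0
    for n, sample in enumerate(samples, start=1):
        mean = update_mean(mean, n, sample)
    return mean
```

Only two numbers (the sample count and the current mean) have to be stored between runs, so the full sample history never needs to be kept.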
Extending the model to the services
Our first goal is to determine whether the service is working or not (ok/not ok)
The network model explained above is suitable for the services with few changes:
  Omission of the criticality values (Kc=1)
  A tool to determine instantaneous availability is needed (ping is no longer valid)
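Ping only shows that a host is reachable, so a service probe has to talk to the service itself. A minimal Python sketch of an ok/not ok test follows: connect to the service port and check the greeting line. The port numbers and greeting prefixes are the standard ones for these protocols. The names `SERVICES`, `greeting_ok` and `probe` and the timeout value are assumptions for illustration, and the original C programs may have checked more than the greeting.

```python
import socket

# Standard TCP port and accepted greeting prefixes for each protocol.
SERVICES = {
    "smtp": (25, (b"220",)),
    "pop3": (110, (b"+OK",)),
    "imap": (143, (b"* OK",)),
    "nntp": (119, (b"200", b"201")),  # posting allowed / posting not allowed
    "ftp": (21, (b"220",)),
}


def greeting_ok(service: str, banner: bytes) -> bool:
    """Return True if the banner starts with a greeting accepted for the service."""
    _port, prefixes = SERVICES[service]
    return banner.startswith(prefixes)


def probe(host: str, service: str, timeout: float = 5.0) -> bool:
    """Connect to the service port and report ok/not ok based on the greeting."""
    port, _prefixes = SERVICES[service]
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
    except OSError:  # refused, unreachable, timed out, name lookup failed
        return False
    return greeting_ok(service, banner)
```

Each call to `probe` yields one binary sample, which can be averaged over time in the same way as the network samples.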
Details regarding practical implementation
Network availability
Availability is computed from ICMP tests (ping)
Second-level granularity
To compute the coefficients
  Kc1: from the network topology.
  Kc2: from our experience.
The global coefficient of each device is computed as the arithmetic mean of Kc1 and Kc2
A config file stores the network devices that will be tested
Config file example
    [NETWORK]
    #ip_name            ip_address        availability coef.
    gw                  158.109.0.3       0.203125
    CB                  158.109.0.26      0.213541667
    si0swfo1+si0swfo2   158.109.2.233     0.052340183
    c7p1sw1             158.109.8.236     0.051626712
    anccswit            158.109.8.235     0.056078767
    m0swit              158.109.20.214    0.045519406
    cvcsw               158.109.4.200     0.034218037
    cibibfo             158.109.29.208    0.036843607
    hottwp1+hottwp3     158.109.31.200    0.0355879
    dretswit            158.109.25.201    0.050313927
    ecllefo             158.109.25.220    0.048658676
    bhum1sw1            158.109.184.203   0.042323059
    ftisw               158.109.21.202    0.03678653
    recswit             158.109.27.222    0.052368721
    vetswit             158.109.30.201    0.040667808
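The following Python sketch shows how a config file of this shape could drive one availability sample: parse the `[NETWORK]` section, probe each address, and add up the coefficients of the devices that answer. The function names are illustrative, and the reachability test is passed in as a callable (in the original system it is an ICMP ping) so that the sketch does not depend on any particular ping implementation.

```python
def parse_network_section(text):
    """Parse '[NETWORK]' lines of the form 'name ip coefficient' into tuples."""
    devices = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            in_section = (line == "[NETWORK]")
            continue
        if in_section:
            name, address, coef = line.split()
            devices.append((name, address, float(coef)))
    return devices


def instant_availability(devices, is_reachable):
    """Sum the coefficients of the devices that answer the probe."""
    return sum(coef for _name, address, coef in devices if is_reachable(address))


# The sample file from the slide.
SAMPLE = """\
[NETWORK]
#ip_name ip_address availability coef.
gw 158.109.0.3 0.203125
CB 158.109.0.26 0.213541667
si0swfo1+si0swfo2 158.109.2.233 0.052340183
c7p1sw1 158.109.8.236 0.051626712
anccswit 158.109.8.235 0.056078767
m0swit 158.109.20.214 0.045519406
cvcsw 158.109.4.200 0.034218037
cibibfo 158.109.29.208 0.036843607
hottwp1+hottwp3 158.109.31.200 0.0355879
dretswit 158.109.25.201 0.050313927
ecllefo 158.109.25.220 0.048658676
bhum1sw1 158.109.184.203 0.042323059
ftisw 158.109.21.202 0.03678653
recswit 158.109.27.222 0.052368721
vetswit 158.109.30.201 0.040667808
"""
```

The coefficients in the sample sum to 1, so a fully reachable network gives a sample of 1.0. With only cvcsw unreachable the sample drops to about 0.9658.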
Service availability
The model is essentially the same, but now for each machine we analyse its critical services
The system can
  Evaluate the proper functioning of a service at a given time
  Compute the availability over time
A proprietary MIB is used to determine the critical parameters of each service (SNMP queries are supported)
Monitoring requirements
A set of ‘C’ programs (one for each service)
A global configuration file:
    [SERVICES]
    #host name     services to be tested
    cc.uab.es      smtp pop imap
    news.uab.es    nntp
    ftp.uab.es     ftp
    ...
Optionally, for the services:
  A proprietary MIB with the data we want to monitor (not mandatory)
  A modified version of the snmpd daemon
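A Python sketch of how the `[SERVICES]` section could be expanded into individual checks, one per (host, service) pair. The function names are illustrative, and the per-service test is passed in as a callable that stands in for the dedicated C program of each service.

```python
def parse_services_section(text):
    """Expand '[SERVICES]' lines of the form 'host svc1 svc2 ...' into (host, service) pairs."""
    checks = []
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("["):
            in_section = (line == "[SERVICES]")
            continue
        if in_section:
            host, *services = line.split()
            checks.extend((host, service) for service in services)
    return checks


def run_checks(checks, check_service):
    """Run the supplied test on every pair and return {(host, service): ok}."""
    return {(host, service): bool(check_service(host, service))
            for host, service in checks}


# The hosts and services listed on the slide (the elided entries are omitted).
SAMPLE = """\
[SERVICES]
#host name services to be tested
cc.uab.es smtp pop imap
news.uab.es nntp
ftp.uab.es ftp
"""
```

Adding a server or a service then only requires a new line or a new word in the config file, which matches the goal of adapting easily to changes.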
Practical Results in our Campus Network
Environment
Development:
  PC Pentium II 300 + Linux RedHat 6.0
  gcc 2.91.66
Production:
  Ultra Enterprise 450 + Solaris 2.6 (gcc 2.95.2)
  PC Pentium II 400 + Linux RedHat 6.0 (gcc 2.91.66)
Our network availability has achieved monthly values between 0.99843 and 1.0
For the services, we use it both to obtain availability values and to test that they are working properly.
Overview of the system
A set of routines that verify that the services are working properly:
  bootp, dhcp, dns, ftp, http, smtp, pop, imap, radius, nntp
A C program that implements the mathematical model
Configuration file, which includes
  Coefficients for the model
  Servers and services to be monitored.
Optionally, a modified version of the snmpd daemon if access to the proprietary MIB is desired.
Measured availability data
Network Availability Data (monthly, chart axis from 98.000% to 100.000%)
    Month    Availability (%)
    Apr00    99.916
    May00    99.916
    Jun00    99.916
    Jul00    99.932
    Aug00    99.929
    Sep00    99.940
    Oct00    99.844
    Nov00    99.912
    Dec00    98.636
    Jan01    99.993
    Feb01    99.970
    Mar01    99.985
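To relate these percentages to the ‘5 9s’ objective, the Python sketch below converts each monthly value into an equivalent number of minutes of total outage and computes the availability over the whole period. Treating the weighted availability as if it were full-network downtime is a simplification made for illustration, and the function names are not from the paper.

```python
import calendar

# Monthly availability (%) from the chart, April 2000 to March 2001.
MONTHLY = [
    (2000, 4, 99.916), (2000, 5, 99.916), (2000, 6, 99.916), (2000, 7, 99.932),
    (2000, 8, 99.929), (2000, 9, 99.940), (2000, 10, 99.844), (2000, 11, 99.912),
    (2000, 12, 98.636), (2001, 1, 99.993), (2001, 2, 99.970), (2001, 3, 99.985),
]


def downtime_minutes(year, month, availability_pct):
    """Equivalent minutes of total outage in the given month."""
    days = calendar.monthrange(year, month)[1]
    return (1 - availability_pct / 100) * days * 24 * 60


def period_availability(monthly):
    """Availability (%) over the whole period, weighting each month by its length."""
    total_days = sum(calendar.monthrange(y, m)[1] for y, m, _ in monthly)
    weighted = sum(calendar.monthrange(y, m)[1] * a for y, m, a in monthly)
    return weighted / total_days
```

For example, December 2000 (98.636%) corresponds to about 609 minutes, roughly ten hours, and the twelve months together come to about 99.82%.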
‘on line’ service monitoring
Based on the programs that are used to check service availability
Useful for network operators to detect network problems
Operating modes:
  Interactive: provides a report with configurable debug output
  Cron-based: generates mail/SMS messages if any problems are detected
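A Python sketch of the cron-based mode: stay silent when every check passes and compose a mail message otherwise. The message wording, the function names and the `localhost` relay are assumptions for illustration. SMS delivery is not shown, since the slides do not say how it was done.

```python
import smtplib
from email.message import EmailMessage


def build_alert(failures, sender, recipient):
    """Return an EmailMessage listing the failed checks, or None if none failed."""
    if not failures:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"{len(failures)} service check(s) failed"
    msg.set_content("\n".join(f"{host}:{service} NOT OK" for host, service in failures))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Hand the message to a mail relay; 'localhost' is an assumed default."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

A cron job would run the checks, call `build_alert` with the failed (host, service) pairs, and call `send_alert` only when the result is not `None`.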
Interactive execution example
    *** 14:35:06.308612 blues.uab.es:smtp ...
    220 blues Sendmail 5.65v4.0 (1.1.19.2/17Dec99-1023AM) Tue, 7 Nov 2000 13:44:20 +
    QUIT
    221 blues closing connection
    *** 14:35:06.351857 blues.uab.es:smtp 116b,0.0432s,2.6Kb/s OK!
    *** 14:35:06.360239 blues.uab.es:pop3 ...
    +OK POP3 blues v7.63 server ready
    QUIT
    +OK Sayonara
    *** 14:35:06.512647 blues.uab.es:pop3 49b,0.152s,0.31Kb/s OK!
    *** 14:35:06.521040 blues.uab.es:imap ...
    * OK blues IMAP4rev1 v12.261 server ready
    A01 LOGOUT
    * BYE blues IMAP4rev1 server terminating connection
    A01 OK LOGOUT completed
    *** 14:35:06.622567 blues.uab.es:imap 121b,0.102s,1.2Kb/s OK!
    *** 14:35:06.631185 news.uab.es:nntp ...
    200 news.cesca.es InterNetNews NNRP server INN 2.3.0 ready (posting ok).
    QUIT
    205 .
    *** 14:35:06.884128 news.uab.es:nntp 81b,0.253s,0.31Kb/s OK!
    *** 14:35:06.892686 ftp.uab.es:ftp ...
    220 koala FTP server (Version wu-2.5.0(1) Wed Oct 20 12:02:15 DST 1999) ready.
Comparison to other existing products and future work
Comparison
As a monitoring tool, there are other existing products, which are really powerful
  Big Brother
  NoCoL
Regarding the graphic tools, they generate no data, but simply display it.
None of these gives a detailed model for the network availability or monthly reports
Future work
Analysis and integration of multiple meters
Improvement of the Web interface
WAP interface to the monitoring system
Conclusions
Benefits of the system
Proper working of all servers and services is easily and centrally verified
Easy adaptation to network changes
  Network devices
  New servers or services
Monthly reports of both internal and external connectivity
Availability reports of the relevant services
Easy to integrate with graphic tools such as MRTG
For more information...
This presentation: ftp://ftp.uab.es/pub/terena/QoS.ppt
The paper can be found at: ftp://ftp.uab.es/pub/terena/QoS.pdf
Questions, comments, suggestions...