1
Telecom Network Fault Prediction
H. K. Yuen
Department of Management Sciences
City University of Hong Kong
2
Outline
• Problem Formulation
• Variable Selection
• Model Development
• Model Implementation
3
Problem Formulation
• Overview
– Messages about network performance are generated from transmission stations
– Messages are examined manually
– Messages are classified as urgent fault or non-urgent fault
– Goal: build a model to predict whether a received message signals an urgent fault or not
4
Problem Formulation
• The Data
– 5,924 past messages were collected
– Each message contains 1,082 variables
– Each message was examined manually
– The decision "Urgent" or "Non-Urgent" was set as the target variable
• Urgent case = "True"; Non-Urgent case = "Null"
5
Problem Formulation
• Distribution of the Target Variable
[Chart omitted; categories: Null, True]
6
Problem Formulation
• Selection of Cases
– Use the Sampling node of Enterprise Miner (EM) to select a sample
7
Variable Selection
• Using all 1,082 variables in the model is not practical
• It is also impractical to examine manually the association between the target variable and each input variable
• The Tree node and the Variable Selection node of Enterprise Miner were employed
8
Variable Selection
• Process flow
9
Variable Selection
• Some results from Tree1
• A total of 23 variables are selected as input
10
Model Development
• Data are partitioned into three parts
– Training (50%)
– Validation (25%)
– Testing (25%)
• Two possible model selection criteria:
– The one that most accurately predicts the response (either "True" or "Null")
– The one that generates the highest expected profit
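The 50/25/25 partition can be sketched outside Enterprise Miner as a shuffled split. This is a minimal illustration, not the Data Partition logic of EM itself: the function name `partition`, the fixed seed, and the use of 5,924 as the record count are assumptions (the deck partitions a sample whose size is not stated).

```python
import random

def partition(records, fractions=(0.50, 0.25, 0.25), seed=0):
    """Shuffle records and split them into training, validation and test sets.

    The first two fractions set the training and validation sizes;
    the test set receives whatever remains.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_train = int(n * fractions[0])
    n_valid = int(n * fractions[1])
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Illustrative record IDs only; real inputs would be the sampled messages.
train, valid, test = partition(range(5924))
print(len(train), len(valid), len(test))
```

The training set fits the model, the validation set is used to compare candidate models, and the test set is held back for a final, unbiased check.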
11
Model Development
• Modified Profit Vector
• Neural Network models with different settings were developed
• Model Output: Prob(Target variable="True")
12
Model Development
• Process flow
• Model Manager
13
Model Development
• How to choose a model with the most predictive power?
– Sensitivity: # of actual "True" cases predicted as "True" / # of actual "True" cases
– Specificity: # of actual "Null" cases predicted as "Null" / # of actual "Null" cases
– Cutoff point: Observations with predicted probability of the target event greater than a cutoff point are classified as "True"
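The definitions above can be made concrete with a short sketch. The function name `sensitivity_specificity` and the five example observations are hypothetical; only the "greater than the cutoff" rule and the "True"/"Null" labels come from the slides.

```python
def sensitivity_specificity(actual, prob_true, cutoff):
    """Return (sensitivity, specificity) for a given probability cutoff.

    An observation is classified as "True" when its predicted
    probability is strictly greater than the cutoff.
    """
    tp = fn = tn = fp = 0
    for label, p in zip(actual, prob_true):
        predicted_true = p > cutoff
        if label == "True":
            if predicted_true:
                tp += 1  # urgent fault correctly flagged
            else:
                fn += 1  # urgent fault missed
        else:
            if predicted_true:
                fp += 1  # non-urgent message flagged as urgent
            else:
                tn += 1  # non-urgent message correctly passed over
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Hypothetical example data: two urgent and three non-urgent messages.
actual = ["True", "True", "Null", "Null", "Null"]
prob_true = [0.9, 0.3, 0.6, 0.2, 0.1]
sens, spec = sensitivity_specificity(actual, prob_true, cutoff=0.5)
print(sens, spec)
```

Lowering the cutoff classifies more messages as "True", which raises sensitivity at the expense of specificity.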
14
Model Development
• Receiver Operating Characteristic (ROC) Chart
[Chart omitted; y-axis: Sensitivity; x-axis: 1-Specificity; annotations: Cutoff (Higher, Lower), All "True", All "Null", Winner]
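An ROC chart is traced by sweeping the cutoff and plotting sensitivity against 1-specificity at each value. The sketch below is one simple way to compute those points and the area under the curve; the function names `roc_points` and `auc` and the example data are hypothetical, and this is not the computation Enterprise Miner performs internally.

```python
def roc_points(actual, prob_true):
    """Return (1 - specificity, sensitivity) pairs, from the highest cutoff
    (everything classified "Null") to the lowest (everything "True")."""
    n_true = sum(1 for a in actual if a == "True")
    n_null = len(actual) - n_true
    # Each distinct predicted probability is a candidate cutoff;
    # -inf is appended so the last point classifies everything as "True".
    cutoffs = sorted(set(prob_true), reverse=True) + [float("-inf")]
    points = []
    for c in cutoffs:
        tp = sum(1 for a, p in zip(actual, prob_true) if p > c and a == "True")
        fp = sum(1 for a, p in zip(actual, prob_true) if p > c and a == "Null")
        points.append((fp / n_null, tp / n_true))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Hypothetical example data: two urgent and three non-urgent messages.
actual = ["True", "True", "Null", "Null", "Null"]
prob_true = [0.9, 0.3, 0.6, 0.2, 0.1]
points = roc_points(actual, prob_true)
area = auc(points)
print(points[0], points[-1], round(area, 3))
```

A model whose curve lies closer to the top-left corner (larger area) separates "True" from "Null" better across all cutoffs, which is one common basis for picking a winner among candidate models.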
15
Model Development
• Correct Classification Chart
– Displays the prediction accuracy for each actual target level across a range of cutoff values
[Chart omitted; x-axis: Cutoff]
16
Implementing the Model
An incoming signal with predicted Prob(target variable = "True") = p
• Class 1: p above 0.5
– Send technician
• Class 2: otherwise (p between 0.15 and 0.5)
– Examine the signal manually
• Class 3: p below 0.15
– Ignore the signal
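The three-way rule can be written as a small function. The thresholds 0.5 and 0.15 come from the slide; the function name `triage` and the handling of values exactly equal to a threshold (treated here as Class 2) are assumptions, since the slide does not say how boundary values are handled.

```python
def triage(p, upper=0.5, lower=0.15):
    """Map a predicted probability of an urgent fault to an action.

    Assumption: a probability exactly equal to a threshold falls in Class 2.
    """
    if p > upper:
        return "Class 1: send technician"
    if p < lower:
        return "Class 3: ignore the signal"
    return "Class 2: examine the signal manually"

for p in (0.80, 0.30, 0.05):
    print(p, triage(p))
```

Only Class 2 signals still need a human to look at them, which is where the manpower saving comes from.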
17
Implementing the Model
• Results of classification

Actual     Class 1    Class 2    Class 3
True       50.38%     36.84%     12.78%
Null       11.31%     25.68%     63.01%
Overall    12.19%     25.93%     61.88%

• Benefits:
– Saving in manpower
– Faster response time to problems
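As a rough consistency check on the classification results, suppose the Overall row is the average of the True and Null rows weighted by each group's share of messages (the slides do not state this). Under that assumption, each class column implies a share of "True" messages. The variable and function names below are hypothetical.

```python
# Row percentages copied from the results of classification.
true_row = [50.38, 36.84, 12.78]
null_row = [11.31, 25.68, 63.01]
overall_row = [12.19, 25.93, 61.88]

def implied_true_share(t, n, o):
    """Solve o = s * t + (1 - s) * n for s, the share of "True" messages."""
    return (o - n) / (t - n)

shares = [
    implied_true_share(t, n, o)
    for t, n, o in zip(true_row, null_row, overall_row)
]
for name, s in zip(("Class 1", "Class 2", "Class 3"), shares):
    print(name, round(100 * s, 2), "%")
```

All three columns point to a "True" share of a little over 2%, so the rows are mutually consistent under this reading. The estimate is sensitive to rounding in the published percentages and should be treated as approximate.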