phone calls connections relevance to churn in mobile networks

Upload: mihailo-vuk-marinkovic

Post on 04-Jun-2018


  • 8/13/2019 PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN MOBILE NETWORKS


    PHONE CALLS CONNECTIONS RELEVANCE TO CHURN IN

    MOBILE NETWORKS

    Niko GAMULIN¹, dr. Mitja ŠTULAR¹, dr. Sašo TOMAŽIČ²

    ¹Telekom Slovenije, d.d., Cigaletova 15, 1000 Ljubljana
    ²Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1000 Ljubljana

    [email protected]

    Abstract: As the telecommunications market has reached the mature stage, the majority
    of the population in developed areas has already adopted mobile services, and there are
    many competitors on the market, churn prediction has become critical for companies that
    want to retain their market share. By analyzing the data that telecommunications service
    providers store, originally for billing purposes, it is possible to observe their users
    in the context of a social network and gain additional insights about the spread of
    influence relevant to churn. In this paper, we examine the communication patterns of
    mobile phone users and subscription plan logs. Our primary goal is to discover whether
    it is possible to determine which users are more likely to churn by observing their
    outgoing calls and the churn among their neighbors (friends).

    Keywords: churn, social network analysis, machine

    learning

    1. INTRODUCTION

    In order to attract new service consumers and retain the existing ones,
    telecommunications service providers have been constantly forming new subscription
    plans, adapting the terminal equipment offer according to current trends and improving
    the quality of services by upgrading network equipment. Along with the development of
    machine learning methods and their efficiency, service providers from all industries
    have become aware of the importance of data that can be used to gain additional insights
    about their service consumers and consequently target the important customers who are
    more prone to churn. Many methods have already been proposed for churn prediction using
    past data. In some of these, the user is treated as an individual, independent of his
    acquaintances' data, while in others network effects are considered as well.

    The main question that motivated our research is whether it is possible to spot a
    spread of behavior, i.e. churn, among connected users solely by observing the strength
    of the call connections among them. In order to construct a social network, we observed
    the Call Detail Record (CDR) data along with the subscription plan log to determine the
    users' subscription state in the observed time period. Each user represented a node and
    the aggregated outgoing calls towards each neighbor represented a directed edge,
    weighted with the number of calls and the sum of the duration of all calls. The observed
    user's subscription state, along with his neighbors' subscription state, served as an
    indicator of the spread of behavior. As this is a dynamic process, the time variable is
    also important. If the time that has elapsed between two acts of the same kind performed
    by two connected users is long, it might not be appropriate to state that the second
    actor followed the first one; the same state of the two connected users might be a mere
    coincidence. On the other hand, if the observation period is too short, we might not
    notice that the two connected users performed the same act after some additional time
    had elapsed. In order to check whether the length of the observation period is relevant,
    we observed the users' states over different time period lengths.

    As we anticipated that an individual subscriber's decision to churn is partially
    motivated by prior churners among his acquaintances, we tried to determine the
    acquaintances' impact on the churn decision by observing the number of phone calls
    established, the duration of phone calls and the number of prior churners among the
    observed subscriber's acquaintances. To demonstrate the relationship between the churn
    decision and the number of prior churners among acquaintances, along with the connection
    strength measured by the relative number and duration of phone calls, we observed the
    users in 3D space, defined by axes that represented the relative number of prior
    churners among acquaintances and the number and duration of calls established with
    these, relative to all acquaintances. To demonstrate the relationship between the
    observed subscriber's behavior and prior acts of their neighbors, we calculated and
    plotted the lift curve, which shows the influence of prior acts made by acquaintances
    for different time period lengths.

    2. PROBLEM STATEMENT

    Although awareness of the importance of social networks has increased significantly
    along with the spread of online social networks, such as Facebook, Twitter, Google+ and
    LinkedIn, the majority of service providers from non-internet industries haven't
    exploited the potential of real social networks of interconnected people, who influence
    each other in the real world. While several online services and retailers have already
    developed marketing campaigns based on social networks, the majority of businesses from
    other industries still try to attract new customers and keep the existing ones by
    treating each of them as an individual; they mostly invest in broadcast marketing
    campaigns and form offers on a global level. Some service providers and retailers are
    certainly not able to treat their customers as interconnected peers, influenced by their
    friends, because they lack the data needed to represent users as nodes and form edges
    between them. On the other hand, telecommunications service providers have to keep the
    data on their customers' phone conversations for charging purposes. These same data
    could be used to form a social network of interconnected users and anticipate how they
    influence each other.

    Our motivation for this research was based on the assumption that users influence each
    other over phone conversations and that the power of influence depends on the duration
    and number of conversations. According to this assumption, a user with many friends who
    have churned is also more likely to churn. Furthermore, we guessed that if the observed
    user is influenced by his peers, he follows them within a shorter amount of time.

    The following section gives an overview of some previously proposed churn prediction
    methods in which users are observed in a social network context.

    3. EXISTING SOLUTIONS

    Numerous studies have dealt with the churn problem in the telecommunications services
    sector. When dealing with the churn problem, one first has to be aware of the big
    difference between prepaid and postpaid users. The former are, as opposed to the latter,
    not bound by a contract. In the case of prepaid users, it is easier to observe users in
    a social network context and extract the rules for the diffusion of churn, as these
    users are free to change their service plan at any time. A model in which prepaid users
    are observed in the context of a social network is presented in [1]. On the other hand,
    while it is easier to observe prepaid users in the context of a social network, it is
    not trivial to determine their churn status, as such users don't explicitly cancel the
    subscription plan. In [2], a model for prepaid user labeling is proposed along with a
    churn prediction technique in which users are observed as individuals, without the
    influence of interconnected users. Dierkes et al. [3] use Markov Logic Networks (MLNs)
    to examine whether the churn decisions of individuals in previous time periods have an
    impact on the other users with whom they interacted via voice call, short message
    service (SMS) or multimedia message service (MMS).

    4. THE PROPOSED SOLUTION

    The main motivation for this research was to prove that users connected to each other
    by phone calls influence one another, and to determine the importance of the strength of
    connections for the spread of churn. Although there are many factors that influence a
    user's decision about a subscription plan change, such as service price, marketing
    campaigns and special offers from competing providers, we wanted to prove that the
    social factors themselves play an important role and for

    5. DATASET

    For our analysis, we used anonymised historical data for about 790,000 users and about
    42,000,000 aggregated daily call connection records from the CDR for September and
    October 2010. Each call connection record contained the number of calls between the
    caller and the called person and the sum of the call durations for the observed day.
    Along with the call connection data, we had available the churn log for the period from
    2005 to 2011, from which it was possible to label each observed user as either a churner
    or a non-churner.

    In order to perform the experiment, the original data had to be reshaped in the
    following way. First, the list of all active postpaid users was made from the aggregated
    daily call records, i.e. all postpaid callers were selected. Given this list, we defined
    three different observation period lengths: 60 days, 30 days and 15 days. For each
    period length we looped through the list of active users and for each one determined the
    total number of called neighbors, the number of all outgoing calls and the sum of the
    duration of all outgoing calls. If the user churned within the period covered by the
    observed CDR data, he was labeled as a churner, otherwise as a non-churner. Then,
    according to the observed period length, all of his neighbors that churned before him,
    inside the defined period length, were counted, and the duration and number of outgoing
    calls to them were summed. Each neighbor that churned outside the defined period length,
    or after the observed user, was labeled as a non-churner and the corresponding call data
    were treated as calls to non-churners, i.e. the relative number of calls and duration
    for non-churners was increased. From these data, the relative values were calculated for
    the number of neighbors that churned before, the number of outgoing calls to them and
    the sum of the duration of outgoing calls to them. After reshaping the data, we
    performed the experiment described in the following section.
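The reshaping step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the record layout `(caller, callee, n_calls, duration)`, the integer day indices in `churn_day`, the function name `neighbor_churn_features`, and the choice to measure non-churners against the last observed day (`end_day`) are all assumptions made for the example. The names x, y and z follow the caption of Table 1.

```python
from collections import defaultdict


def neighbor_churn_features(calls, churn_day, period_days, end_day):
    """Return {caller: {"x", "y", "z", "churner"}} from aggregated call records.

    calls       -- iterable of (caller, callee, n_calls, duration) tuples
    churn_day   -- dict mapping a user to the day index on which he churned
    period_days -- observation period length (e.g. 60, 30 or 15)
    end_day     -- last day covered by the call data (assumed reference day
                   for users who did not churn)
    """
    def empty():
        return {"neighbors": set(), "calls": 0, "duration": 0.0}

    totals = defaultdict(empty)  # all outgoing traffic per caller
    prior = defaultdict(empty)   # traffic towards neighbors that churned before

    for caller, callee, n_calls, duration in calls:
        reference_day = churn_day.get(caller, end_day)
        callee_day = churn_day.get(callee)
        # A neighbor counts only if he churned before the observed user
        # and no more than period_days earlier.
        is_prior_churner = (
            callee_day is not None
            and 0 < reference_day - callee_day <= period_days
        )
        buckets = [totals[caller]]
        if is_prior_churner:
            buckets.append(prior[caller])
        for bucket in buckets:
            bucket["neighbors"].add(callee)
            bucket["calls"] += n_calls
            bucket["duration"] += duration

    def ratio(part, whole):
        return part / whole if whole else 0.0

    features = {}
    for user, total in totals.items():
        p = prior[user]
        features[user] = {
            "x": ratio(len(p["neighbors"]), len(total["neighbors"])),
            "y": ratio(p["duration"], total["duration"]),
            "z": ratio(p["calls"], total["calls"]),
            "churner": user in churn_day,
        }
    return features
```

Calling the function once per period length (60, 30 and 15) on the same records yields the three reshaped data sets; neighbors that churned outside the window simply stay in the totals, which matches treating them as non-churners.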

    6. EXPERIMENT

    Each record from the reshaped data can be represented as a pair consisting of an input
    vector of independent variables and an output dependent target class variable. In our
    case, the independent variables were the number of neighbors that churned before,
    relative to the number of all neighbors, and the number of calls and the sum of the
    duration of all calls to neighbors that churned before, relative to all neighbors; the
    target variable was the class that represented whether the observed user had churned or
    not. As the maximum length of the input vector is 3 and the output class variable has 2
    possible states, it is possible, for simpler visual interpretation, to represent the
    observed users as colored points scattered in 3D space (Picture 1). The main aim of
    representing observed users in 3D space was to gain intuition for further analysis; on
    the one hand, reducing the number of observed variables can potentially reduce the
    quality of results, while, on the other hand, with a large number of input variables
    the result might be difficult to interpret.

    Picture 1 - Sample set of users from the

    observed dataset, represented as points

    in space. Blue points represent non-

    churners while red points represent

    churners. The sphere with centre point

    (1,1,1) represents a classification shape;

    all users inside are classified as

    churners.

    From the visual representation it was possible to draw the following intuitive
    conclusions. If many of the observed user's neighbors churn and if the user spends a
    relatively long time talking with churners, the probability that the observed user will
    churn is higher than for users who don't have many neighbors who churned. Certainly,
    visual observation might lead to a false conclusion, and therefore we decided to draw a
    sphere and observe how many users of each class (churners and non-churners) were
    captured inside or outside the sphere, circle or region (depending on the number of
    observed independent variables and the base point) with varying range from different
    base points.

    With 3 dimensions, represented by the independent input variables, it is possible to
    observe the users' churn state depending on a single input variable or on combinations
    of 2 or all 3 variables. To select a segment of observed users, we first set a base
    point and then increased the observed area range from the minimum value, 0, to the
    maximum value. For all possible combinations of input variables, we first set the base
    point to the origin of the coordinate system and then to the point furthest away from
    the origin. In the first case, the observed area consisted of the points whose distance
    from the origin was greater than or equal to the current range. In the second case,
    with the base point set to the point furthest away from the origin, the observed area
    consisted of the points whose distance from the base point was smaller than or equal to
    the current range. Depending on the number of observed variables, the observed area was
    defined by a line, a circle (Picture 2) or a sphere.

    The following description of the experiment is limited to the case in which the point
    farthest from the origin of the coordinate system is selected as the base point, as
    this model achieved better results, although the difference was not significant.

    Picture 2 - Examples of the observation area (marked in grey) for 2 input variables,
    the relative number of neighbors that churned before and the relative duration of calls
    to neighbors that churned before, for base points set to the origin of the coordinate
    system (a) and to the point furthest away, (1,1) (b). In case (a), the observation area
    consists of the points whose distance from the base point (0,0) is larger than or equal
    to R, whereas in case (b) it consists of the points whose distance from the base point
    (1,1) is smaller than or equal to R.

    In the case of observing a single variable, its value was gradually increased from the
    minimum value, 0, to the maximum value, 1; the users' churn states inside and outside
    the range were observed and the percentage of churners and non-churners inside and
    outside the range was calculated. Similarly, in the case of combinations of 2
    independent variables, the circle center was set at the point (1,1) and the circle
    radius, which represented the observed range, was gradually increased from 0 to the
    value √2, at which both observed variables span their full range from 0 to 1. In the
    case of the sphere, the center was set at the point (1,1,1) and the radius was
    increased from 0 to √3.

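    d = \begin{cases}
          \sqrt{x^2 + y^2},             & B = (0,0) \\
          \sqrt{(1 - x)^2 + (1 - y)^2}, & B = (1,1)
        \end{cases}                                                        (1)

    where d is the distance of a user's point (x, y) from the base point B, which is
    compared against the current range R.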
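The selection of users by distance from a base point can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `distance_from_base`, `in_observed_area` and `captured_counts` are hypothetical, every input variable is assumed to lie in the interval [0, 1], and the base point is either the origin or the all-ones point, as in the description above.

```python
import math


def distance_from_base(point, base):
    """Euclidean distance between a user's feature point and the base point."""
    return math.sqrt(sum((p - b) ** 2 for p, b in zip(point, base)))


def in_observed_area(point, radius, base_at_origin=False):
    """Decide whether a point falls inside the observed area.

    Base point at the origin: points at distance >= radius are observed.
    Base point at (1, ..., 1): points at distance <= radius are observed.
    """
    dims = len(point)
    if base_at_origin:
        return distance_from_base(point, (0.0,) * dims) >= radius
    return distance_from_base(point, (1.0,) * dims) <= radius


def captured_counts(points, is_churner, radius, base_at_origin=False):
    """Count churners and non-churners captured inside the observed area."""
    churners = non_churners = 0
    for point, churned in zip(points, is_churner):
        if in_observed_area(point, radius, base_at_origin):
            if churned:
                churners += 1
            else:
                non_churners += 1
    return churners, non_churners


if __name__ == "__main__":
    # Toy data: (x, y) pairs with a churn flag; values are made up.
    pts = [(0.9, 0.9), (0.1, 0.2), (0.8, 1.0)]
    flags = [True, False, False]
    steps = 5
    for i in range(steps + 1):
        r = math.sqrt(2) * i / steps  # sweep the radius from 0 to sqrt(2)
        print(round(r, 3), captured_counts(pts, flags, r))
```

Because the same functions accept points of length 1, 2 or 3, one sweep routine covers the line, circle and sphere cases; only the upper limit of the radius changes with the number of dimensions.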

    Having calculated the percentage of churners and non-churners inside and outside the
    observed range for each combination of input variables at each step, it is possible to
    draw conclusions about how the observed user's interaction with his neighbors and the
    neighbors' churn affect the observed user's decision about churn. To measure the
    relevance of different segmentations, based on the combination of independent
    variables, we used some standard data mining measures, which are described in the
    following section.

    7. RESULTS

    In this section, we present the results of our experiment for different observation
    period lengths, comparing different combinations of observed variables for the
    segmentation of churners.

    For each defined time period length, we present and discuss the results of all
    combinations and present the Receiver Operating Characteristic (ROC) along with
    precision and recall values for a selected fraction of segmented users. These measures
    are defined as follows. Let TP be the true positives, TN the true negatives, FP the
    false positives and FN the false negatives. In this experiment, TP represents the
    number of churners captured inside the observed range, FP the non-churners inside the
    observed range, TN the non-churners outside the observed range and FN the churners
    outside the observed range. Precision is defined as the fraction of retrieved instances
    that are relevant (Equation 2), while recall is the fraction of relevant instances that
    are retrieved (Equation 3).

    \mathrm{Precision} = \frac{TP}{TP + FP}                                (2)

    \mathrm{Recall} = \frac{TP}{TP + FN}                                   (3)
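The two definitions translate directly into code. The sketch below is illustrative only: the function name `precision_recall` and the two parallel boolean lists (`captured` for users inside the observed range, `is_churner` for the true label) are assumptions made for the example.

```python
def precision_recall(captured, is_churner):
    """Compute precision and recall for one capture range.

    captured   -- list of booleans, True if the user is inside the observed range
    is_churner -- list of booleans, True if the user actually churned
    """
    tp = sum(1 for c, y in zip(captured, is_churner) if c and y)        # churners inside
    fp = sum(1 for c, y in zip(captured, is_churner) if c and not y)    # non-churners inside
    fn = sum(1 for c, y in zip(captured, is_churner) if not c and y)    # churners outside

    # Guard against empty selections and data sets without churners.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    captured = [True, True, False, False, True]
    is_churner = [True, False, True, False, False]
    print(precision_recall(captured, is_churner))
```

In this toy input one of the three captured users is a churner and one of the two churners is captured, so precision is 1/3 and recall is 1/2.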

    Having calculated precision and recall values for different rates of population
    captured inside the observation area, it is possible to represent these values
    graphically (Picture 3) and interpret the significance of the segmentation against
    random selection. If all users were labeled as churners, the precision would be equal
    to the actual churn rate and the recall would be equal to 1, as all churners among all
    users would be selected. In the case of random selection, the precision is always close
    to the actual churn rate, while the recall increases in proportion to the selected
    population size, assuming that churners and non-churners are evenly distributed among
    the population. When criteria are defined to select a specific segment of the
    population with the aim of increasing the precision and recall values, the segmentation
    efficiency can be measured by comparing the values for segmented users with the values
    for random selection. Certainly, it is very difficult to design a perfect segmentation
    model valid for general usage, and therefore in real models there is a certain amount
    of samples, in this case non-churners, that are classified as churners.

    Picture 3 - Precision and recall values for random selection, ideal segmentation and
    segmentation with observation of the prior neighbor churners rate, the relative number
    of calls to neighbors that churned before and the relative duration of calls to
    neighbors that churned before, with time period length 60 days


    Picture 4 - ROC curve and ΔAuC (colored grey) for observation of the prior neighbor
    churners rate, the relative number of calls to neighbors that churned before and the
    relative duration of calls to neighbors that churned before, with time period length
    60 days

    Besides the precision and recall values, a useful measure for model evaluation is the
    Area under Curve (AuC) value, which is derived from the Receiver Operating
    Characteristic (ROC) [4]. The ROC curve is a graphical plot that illustrates the
    performance of a binary classifier for different rates of captured users and enables
    the observer to visually estimate the cost/benefit ratio of the segmentation model for
    a selected size of population. The best possible prediction method would yield a point
    in the upper left corner, at coordinate (0, 1) of the ROC space, in which case all
    actual churners (TP) and none of the non-churners (FP) would be selected. The actual
    ROC curve values depict the relative trade-offs between the benefit of selecting actual
    churners and the cost of classifying actual non-churners as churners (FP). The AuC
    value is the proportion of the area of the unit square that lies under the ROC curve
    and is equivalent to the probability that the classifier will rank a randomly chosen
    positive instance (a churner in this case) higher than a randomly chosen negative
    instance (a non-churner). As random guessing produces the diagonal line between (0,0)
    and (1,1), which splits the whole observation space in half, the AuC value for random
    guessing is equal to 0.5. As in this discussion we observed the difference between
    random guessing and using the classification model, we calculated the size of the area
    between the random guessing curve and the actual classification model ROC curve (ΔAuC),
    as shown in [4]. Similarly, besides the ΔAuC value derived from the ROC curve, we used
    the recall values of random guessing and of the actual model to calculate the size of
    the area between the actual model recall curve and the random guessing recall curve.

    By adjusting the capture range, which in this case represents a side of a rectangle
    when observing 1 dimension, a circle radius in the case of 2 dimensions and a sphere
    radius in the case of 3 dimensions, it is not possible to capture an exact percentage
    of users, and therefore we used the ΔAuC for model estimation. The precision, recall,
    ΔAuC and ΔAuC (Recall) values for different combinations of input variables, for the
    observation period length of 60 days, are listed in Table 1.

    Dimensions  ΔAuC     Precision (actual % of users)  Recall            ΔAuC (Recall)
                         ~5%            ~10%            ~5%      ~10%
    x           0.13075  0.0377 (5)     0.0237 (9.6)    0.2814   0.3397   0.12987
    y           0.1353   0.0419 (4.93)  0.0299 (7.3)    0.3081   0.3264   0.1344
    z           0.1368   0.0443 (4.72)  0.0263 (8.55)   0.3126   0.3362   0.13588
    x, y        0.13395  0.0425 (4.86)  0.0238 (9.53)   0.3084   0.3393   0.13305
    x, z        0.13466  0.0422 (4.99)  0.0238 (9.55)   0.3147   0.3393   0.13376
    y, z        0.13643  0.0446 (4.64)  0.0271 (8.25)   0.3095   0.334    0.13551
    x, y, z     0.02355  0.0425 (4.95)  0.0236 (9.64)   0.3143   0.3397   0.02359

    Table 1 - Model performance for the time period length of 60 days, using different
    combinations of input variables, where x represents the relative number of neighbors
    that churned before, y the relative duration of calls to neighbors that churned before
    and z the relative number of calls to neighbors that churned before
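The area-based measures discussed in this section can be computed as in the following sketch. It is an illustration under assumptions, not the procedure used for Table 1: the function names are hypothetical, each user is given a score where a higher score means "more likely a churner" (for the sphere model this could be the negative distance from the base point), and the area is obtained with the trapezoidal rule. The input must contain at least one churner and one non-churner.

```python
def roc_points(scores, is_churner):
    """Return ROC points (false positive rate, true positive rate)."""
    positives = sum(1 for y in is_churner if y)
    negatives = len(is_churner) - positives
    ranked = sorted(zip(scores, is_churner), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all users tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def delta_auc(scores, is_churner):
    """Area between the model's ROC curve and the random-guessing diagonal."""
    return area_under_curve(roc_points(scores, is_churner)) - 0.5


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6]
    labels = [True, False, True, False]
    print(area_under_curve(roc_points(scores, labels)), delta_auc(scores, labels))
```

In the toy input three of the four churner/non-churner pairs are ranked correctly, so the area under the curve is 0.75 and the area above the diagonal is 0.25.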


    Comparing the results for different period lengths, we can see that the number of
    calls to neighbors that churned before, relative to the total number of calls, is the
    best single-variable predictor for period lengths of 60 and 30 days. It is also close
    to the best one for the 15-day period length, where the best predictor is the relative
    duration of calls to neighbors that churned before.

    By comparing model performance for different period lengths and the same combinations
    of input variables, we can see that the model achieves the best values for the 60-day
    period length. To describe the usefulness of the model in practice, we can consider the
    case of observing the relative number of neighbors that churned before and the relative
    number of calls to neighbors that churned before for the time period of 60 days. For
    this case, if we set the range threshold to the value at which around 5% (4.99%) of
    segmented users are captured and treated as churners, the ΔAuC value (the area between
    the model's ROC curve and the random guessing diagonal) is equal to 0.13466, precision
    is equal to 0.0422, recall is equal to 0.3147 and ΔAuC (Recall) is equal to 0.13376.
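To put these figures in perspective against random selection, the cumulative lift can be derived from the reported recall and the captured share of users. The sketch below is a worked arithmetic example, not a result stated in the paper: the function name `lift` is hypothetical, and it relies on the property described earlier that random selection yields a recall roughly equal to the selected fraction of the population.

```python
def lift(recall, fraction_selected):
    """Cumulative lift: recall of the model divided by the recall that
    random selection would give for the same share of users."""
    return recall / fraction_selected


if __name__ == "__main__":
    # Values for the (x, z) combination at 60 days, taken from Table 1.
    recall_xz = 0.3147
    fraction_xz = 0.0499  # 4.99% of users captured
    print(round(lift(recall_xz, fraction_xz), 2))
```

With these inputs the lift is roughly 6.3, i.e. the captured 4.99% of users contain about six times as many churners as a random sample of the same size would be expected to contain.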

    8. CONCLUSIONS AND FUTURE

    RESEARCH DIRECTIONS

    Where the majority of the population has already adopted mobile services, it is
    critical to implement churn prediction methods in order to retain market share. Besides
    observing the behavior of users as individuals, it is crucial to discover patterns and
    rules that hold for a network of interconnected users. In this research, we showed that
    the observed user's behavior in terms of churn depends on his neighbors' prior
    behavior.

    When users are observed as individuals, the importance of many users might be
    overlooked. From billing records, the service provider can measure a user's importance
    by the amount of monthly charges, so users who are not active do not stand out.
    They are nevertheless important if they receive many incoming calls, and therefore
    indirectly generate significant profit as well. If the churn of such users was
    prevented, the spread of churn to active users, who directly generate profit, could be
    prevented by targeting the influential neighbors.

    As the existence of influence among connected users has been shown, our plan for
    future research is to observe the connections over a longer time period and to
    distinguish the contribution of each neighbor's influence on the observed user
    separately.

    9. ACKNOWLEDGEMENTS

    The authors would like to thank Telekom Slovenije for cooperation. The work was
    supported in part by the Ministry of Education, Science, Culture and Sport of Slovenia
    and the Slovenian Research Agency. Special thanks go to the European Union for partly
    financing a young researcher training program from the European Social Fund, under the
    Operational Programme Human Resources Development for the period 2007-2013.

    10. REFERENCES

    [1] K. Dasgupta, R. Singh, B. Viswanathan, D. Chakraborty, S. Mukherjea, A. A.
        Nanavati, and A. Joshi, "Social ties and their relevance to churn in mobile
        telecom networks," in Proceedings of the 11th International Conference on
        Extending Database Technology: Advances in Database Technology, Nantes, France,
        2008.

    [2] L. Alberts, I. R. L. M. Peeters, R. Braekers, and C. Meijer, "Churn Prediction in
        the Mobile Telecommunications Industry," Citeseer.

    [3] T. Dierkes, M. Bichler, and R. Krishnan, "Estimating the effect of word of mouth
        on churn and cross-buying in the mobile phone market with Markov logic networks,"
        Decision Support Systems, vol. 51, pp. 361-371, 2011.

    [4] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters,
        vol. 27, pp. 861-874, 2006.
