8/2/2019 DADM-secB
Amit | Chanakya | Dipayan | Felix | Ravi | Satyam | Vijay (Group2-SecB)
2011
DATA ANALYSIS & DECISION MAKING
PGP-I | 2010-2012
Prof. Utpal Bhattacharya
PILGRIM BANK: CUSTOMER PROFITABILITY
ASSIGNMENT Submitted by:
Group 2 | Section B
Ravi Shankar Niranjan
Felix G
Chanakya Levaka
Satyam Gupta
Dipayan Roy
Amit Kumar Bhil
Raja Vijay S
SUBMITTED ON 11TH MARCH, 2011
INDIAN INSTITUTE OF MANAGEMENT INDORE
1 Data Analysis and Decision Making
Executive Summary
Alan Green, a new manager at Pilgrim Bank, wants to better understand
profitability data for the bank's customers. He needs to resolve the decision
problem of whether charging fees for online banking is more profitable for
Pilgrim Bank than offering incentives to promote wider use of the online
channel. To begin, Mr. Green must first address the following research
questions: how much more or less profit do online users generate; is this
difference significant; what are the measures of customer profitability; what
are the characteristics of the bank's online users and profitable customers;
what are the costs of operating the online banking channel; and, finally, what
measures does the bank take to retain its most profitable customers?
Data Collection & Methodology
Green gathered information in two parts. The first was an informal
qualitative meeting with analyst Jane Raines. The purpose of this meeting
was to obtain useful general knowledge on measures of profitability,
customer behavior, cost structure and profitability management, and on how
they relate to each other. The second part was an in-depth quantitative
study based on statistical analyses of a database of customer profits,
online usage, demographics (age, income, geographic district), and tenure
in years.
The meeting with Jane Raines was an eye-opener for Green, as he
learned that customer profitability is given by:
Customer Profitability = (Balance in Deposit Accounts) * (Net Interest Spread)
+ (Fees) + (Interest from Loans) - (Cost to Serve)
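As a hedged illustration of the profitability formula above, the following minimal Python sketch computes the profit of a single customer. The function name, the parameter names and the sample figures are hypothetical and are not taken from the case data.

```python
def customer_profit(balance, net_interest_spread, fees, loan_interest, cost_to_serve):
    """Customer profitability as stated in the report:
    balance * spread + fees + interest from loans - cost to serve."""
    return balance * net_interest_spread + fees + loan_interest - cost_to_serve


# Hypothetical customer: $10,000 in deposits, 2% spread, $30 in fees,
# $50 of loan interest and $120 cost to serve.
example_profit = customer_profit(10000.0, 0.02, 30.0, 50.0, 120.0)
print(round(example_profit, 2))
```

The sketch makes it easy to see why a large balance does not guarantee a profitable customer: a high cost to serve can offset the interest spread earned on deposits.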
The total cost is further divided into variable costs and fixed costs.
The online channel has lower variable costs per transaction but a higher
fixed cost structure. Mr. Green also finds that there is no clear correlation
between account balances and customer profitability. Lastly, he learns about
the initiatives Pilgrim Bank takes to increase profitability and retain its
most profitable customers.
Data Analysis
Data analysis of the customer database begins with a test for sample
bias. Customers are sorted in descending order of profitability and charted
against the cumulative percentage of the bank's profit. Green finds that his
results agree with those presented by Jane Raines, which reassures him that
his sample is not biased. The results also confirm that roughly 10% of the
customers account for 70% of Pilgrim Bank's profits. He then summarizes the
statistics and finds that, on average, online users are more profitable than
non-users ($116.36 versus $110.79). The summary statistics also include the
standard deviation. The mean and standard deviation are not calculated for
the geographic variable since it is measured on a nominal scale.
Plotting a frequency chart shows that the majority of customer profits lie
in the -$100 to $200 range. By examining the sample, Green concludes that
only a little more than half of the customers are profitable and, more
importantly, that a little over 20% of Pilgrim Bank's customers account for
100% of total profits. This shows the importance of retaining existing highly
profitable customers, but it also shows that there is plenty of room to make
unprofitable customers valuable.
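The concentration analysis described above (sorting customers by descending profit and tracking the cumulative share of total profit) can be sketched as follows. This is a minimal illustration, not the authors' spreadsheet: the function names are hypothetical and the ten profit figures are made up so that the arithmetic is easy to follow.

```python
def profit_concentration(profits):
    """Return the share of profitable customers and the cumulative profit curve.

    The curve is a list of (fraction of customers, fraction of total profit)
    pairs, with customers ordered from most to least profitable.
    Assumes total profit is positive.
    """
    ordered = sorted(profits, reverse=True)
    total = sum(ordered)
    n = len(ordered)
    share_profitable = sum(1 for p in ordered if p > 0) / n
    curve = []
    running = 0.0
    for rank, p in enumerate(ordered, start=1):
        running += p
        curve.append((rank / n, running / total))
    return share_profitable, curve


def smallest_fraction_reaching(curve, target_share):
    """Smallest fraction of top customers whose cumulative share meets the target."""
    for fraction, share in curve:
        if share >= target_share:
            return fraction
    return None


# Made-up annual profits for ten customers (not case data).
sample_profits = [500, 300, 100, 50, 50, 0, -20, -30, -50, -100]
share_profitable, curve = profit_concentration(sample_profits)
top_for_all_profit = smallest_fraction_reaching(curve, 1.0)
print(share_profitable, curve[0], top_for_all_profit)
```

In this toy sample half of the customers are profitable, and the top two customers already account for the whole of total profit because the remaining profits and losses cancel out. This is the same pattern the report describes for Pilgrim Bank.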
Regression Analysis
Dorstamp provides Green with a sample of 31,634 customer records to analyze.
Some attributes, such as age bucket and income bucket, are incomplete for
part of the sample. After removing the records with missing values, 22,812
records remain to work with.
1. Regression analysis with profit as the dependent variable and
online usage as the independent variable
Regression Statistics
Multiple R           0.006
R Square             0.000
Adjusted R Square    0.000
Standard Error       282.857
Observations         22812.000

ANOVA
             df          SS               MS          F       Significance F
Regression   1.000       64359.478        64359.478   0.804   0.370
Residual     22810.000   1824986620.020   80008.182
Total        22811.000   1825050979.498

             Coefficients   Standard Error   t Stat   P-value   Lower 95%
Intercept    126.522        2.007            63.033   0.000     122.587
Online       5.003          5.578            0.897    0.370     -5.930
Regression equation: Profit = 126.522 + 5.003(Online)
The adjusted R-squared value is very close to 0, meaning that online
usage alone explains almost none of the variation in profit. This also
suggests a poorly specified regression model in which important variables are
left out. The most significant information derived from the regression is the
p-value. The associated hypotheses are H0: β2 = 0 and Ha: β2 ≠ 0 in the
model y = β1 + β2x + ε, where y is profit, β1 is the intercept, β2 is the
coefficient of online usage and ε is the error term. The p-value of 0.370 is
much greater than the 5% significance level, so we do not have enough
evidence to reject H0. In other words, we cannot conclude that online usage
significantly affects profit.
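The test on the online-usage coefficient can be reproduced from the reported coefficient and standard error. The sketch below is an illustration only: it uses the standard normal distribution in place of the t distribution, an approximation that is reasonable here because the regression has more than 22,000 residual degrees of freedom. The helper name is hypothetical.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(t_stat) / math.sqrt(2))


coef_online = 5.003  # coefficient reported in the first regression
se_online = 5.578    # its reported standard error

t_stat = coef_online / se_online
p_value = two_sided_p_value(t_stat)

# Approximate 95% confidence interval (1.96 is the normal critical value).
lower_95 = coef_online - 1.96 * se_online
upper_95 = coef_online + 1.96 * se_online

print(round(t_stat, 3), round(p_value, 3), round(lower_95, 3), round(upper_95, 3))
```

The interval spans zero, which is another way of seeing that H0 cannot be rejected at the 5% level.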
2. Regression analysis with profit as the dependent variable and
online usage, age, income, tenure and location as the independent
variables
Regression Statistics
Multiple R           0.240
R Square             0.057
Adjusted R Square    0.057
Standard Error       274.644
Observations         22812.000

ANOVA
             df          SS               MS             F         Significance F
Regression   5.000       104804631.889    20960926.378   277.887   0.000
Residual     22806.000   1720246347.609   75429.551
Total        22811.000   1825050979.498

             Coefficients   Standard Error   t Stat   P-value   Lower 95%
Intercept    -103.924       46.831           -2.219   0.026     -195.717
Online       18.242         5.509            3.311    0.001     7.444
Age          18.288         1.246            14.682   0.000     15.847
Inc          17.842         0.785            22.734   0.000     16.303
Tenure       4.028          0.236            17.083   0.000     3.566
District     0.010          0.038            0.263    0.792     -0.065
y = β1 + β2(Online usage) + β3(Age) + β4(Income) + β5(Tenure) +
β6(District) + ε
Fitted equation: Profit = -103.924 + 18.242(Online usage) + 18.288(Age) +
17.842(Income) + 4.028(Tenure) + 0.010(District)
With a p-value of 0.001 for online usage, we reject H0 in favor of Ha
(β2 ≠ 0) and conclude that online usage significantly affects profit. Looking
at the other p-values, it should also be noted that there is very strong
evidence that age, income and tenure are all related to profit.
Not surprisingly, there is no evidence that geographic region is a
significant predictor of profit (p-value of 0.792).
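The summary figures of the second regression can be cross-checked from its ANOVA table. The short sketch below recomputes R Square, Adjusted R Square, the F statistic and the t statistic for online usage from the reported sums of squares, degrees of freedom, coefficient and standard error. The variable names are illustrative.

```python
# Values reported in the ANOVA table of the second regression.
ss_regression = 104804631.889
ss_residual = 1720246347.609
df_regression = 5
df_residual = 22806
n_obs = 22812

ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total
adj_r_square = 1 - (1 - r_square) * (n_obs - 1) / df_residual

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# t statistic for online usage: coefficient divided by its standard error.
t_online = 18.242 / 5.509

print(round(r_square, 3), round(adj_r_square, 3), round(f_stat, 3), round(t_online, 3))
```

Although the model is highly significant overall, an R Square of about 0.057 means that the five predictors together explain under 6% of the variation in customer profit.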
To explain why online usage becomes significant in the second
regression, we first need to look at the correlations between online usage
and the demographic variables. The correlation matrix reveals that age is
slightly negatively correlated with online usage and that income has an
extremely small positive correlation with it. This seems reasonable since
younger generations are more computer savvy, and some income is required to
have computer and internet access, plus the education to be computer literate.
Because age is positively related to profit but negatively related to online
usage, the first regression, which omits age, understates the effect of
online usage. Thus, when these variables are included in the regression, we
obtain a truer estimate of the effect of online usage on profit and,
consequently, enough evidence to reject the null hypothesis.
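The effect described above is an instance of omitted-variable bias. The simulation below is a minimal sketch on synthetic data, not the Pilgrim Bank records: the age buckets, the probability of being online, the true coefficients (20 for online usage, 18 for age) and the noise level are all assumptions chosen only to mimic the pattern in the report. It compares the slope on online usage with and without age in the model.

```python
import random


def centered_cross_sum(xs, ys):
    """Sum of (x - mean_x) * (y - mean_y) over paired observations."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def simulate(n=20000, seed=0):
    """Synthetic customers: older customers are less likely to be online."""
    rng = random.Random(seed)
    online, age, profit = [], [], []
    for _ in range(n):
        a = rng.randint(1, 7)                        # assumed age bucket 1..7
        o = 1 if rng.random() < 0.30 - 0.03 * a else 0
        y = 20.0 * o + 18.0 * a + rng.gauss(0, 50)   # assumed true model
        online.append(o)
        age.append(a)
        profit.append(y)
    return online, age, profit


online, age, profit = simulate()
s_oo = centered_cross_sum(online, online)
s_aa = centered_cross_sum(age, age)
s_oa = centered_cross_sum(online, age)
s_oy = centered_cross_sum(online, profit)
s_ay = centered_cross_sum(age, profit)

# Slope on online usage when age is left out of the model.
simple_slope = s_oy / s_oo
# Slope on online usage when age is included (two-regressor least squares).
multiple_slope = (s_oy * s_aa - s_ay * s_oa) / (s_oo * s_aa - s_oa ** 2)

print(round(simple_slope, 2), round(multiple_slope, 2))
```

In the synthetic data the slope from the simple regression falls well below the assumed true effect of 20, while the slope from the regression that includes age recovers it approximately. This mirrors the move from 5.003 to 18.242 in the report.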
To investigate whether there is any systematic difference between
customers with complete demographic records and those without, basic
descriptive statistics should first be reviewed. The averages for profit and
online usage are 127.18 and 0.1295 respectively for records with complete
demographics, and 72.95 and 0.104 for incomplete records. The two profit
means differ by more than 50. Nonetheless, a hypothesis test is required to
see whether the difference is significant.
                     Samples With Complete       Samples With Incomplete
                     Demographics                Information
                     Profit      Online Usage    Profit      Online Usage
Mean                 127.18      0.129           72.95       0.104
Standard Deviation   282.87      0.335           243.01      0.304
The results of the hypothesis tests show that the differences in both
profit and online usage are statistically significant.
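A test of this kind can be sketched from the summary table alone with a two-sample z test for the difference in means. This is an illustration under stated assumptions, not necessarily the test the authors ran: the size of the incomplete group is assumed to be 31,634 - 22,812 = 8,822 records (the report does not state it directly), and the function name is hypothetical.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for a difference in means with unequal variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


n_complete = 22812
n_incomplete = 31634 - 22812  # assumption: every dropped record is incomplete

z_profit = two_sample_z(127.18, 282.87, n_complete, 72.95, 243.01, n_incomplete)
z_online = two_sample_z(0.129, 0.335, n_complete, 0.104, 0.304, n_incomplete)

# Compare with the two-sided 5% critical value of 1.96.
print(round(z_profit, 2), round(z_online, 2))
```

Both statistics are far above 1.96, which is consistent with the conclusion that the two groups differ in profit and in online usage.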
This raises the question of external validity. The regression analysis
shows that online usage has a positive effect on profit, but the analysis
omits the records without complete demographic data, so we need to exercise
caution when generalizing the conclusions to the entire customer population.
Conclusion
To implement an online banking promotion strategy effectively, we
need to identify the significant characteristics of online users. The results
indicate that age and income are significant variables, while geographic region
is not. Since age is negatively correlated with online usage, Pilgrim Bank
should focus its efforts on migrating younger customers to the online channel.
The same should be done with customers in higher income brackets, since the
income coefficient is positive.
Although we have established the statistical significance of online usage,
its economic significance should also be reviewed. Offering incentives for
online banking will increase the load on the online channel. The management
of Pilgrim Bank must carefully estimate the increase in online usage
after the promotions to see whether the existing infrastructure can support
the extra load. All costs associated with the increase in online customers,
including any new infrastructure that must be built, will need to be compared
with the expected increase in profit to determine the net value. Only then
will the economic value of this strategy be fully addressed.
- Retain the 20% of customers who generate the bank's profits.
- Use incentives to reduce the transaction costs of the other 80% and
  make them profitable.
- There is a small positive relationship between profit and online usage, so
  weigh the economic benefits before introducing online rebates.
- Target younger customers for the online channel, since age is negatively
  correlated with online usage and younger users tend to be more tech-savvy.
- Attract more high-income customers, since profit rises with income (a
  positive correlation).
- There is no relationship between profit and district.
- The longer a customer stays with the bank (tenure), the higher the
  profitability, so add incentives to retain customers.