Logistic regression
Post on 13-Apr-2017
Prediction using Logistic Regression
Why do we need logistic regression?
Regression allows us to predict an output based on some inputs. For instance, we can predict someone's height based on their mother's height and father's height. This type of regression is called linear regression, and it fits here because the outcome variable is a continuous real number.
But what if we wanted to predict something that's not a continuous number?
Let's say we want to predict if it will rain tomorrow. Ordinary linear regression won't work in this case because it doesn't make sense to treat the outcome as a continuous number: it either will rain or it won't. In this case we use logistic regression, because the outcome variable is one of a fixed set of categories (here, two).
Logistic Regression
  Independent variable: Quantitative, Qualitative
  Dependent variable: Qualitative
  Example: Result (Pass, Fail) is a function of time given to study

Regression
  Independent variable: Quantitative, Qualitative
  Dependent variable: Quantitative
  Example: Marks obtained is a function of time given to study
[Figure labels: Study Hours, Result (Pass, Fail), Passing Marks]
Binary logistic regression expression
$$Y = \alpha + \beta_1 X_1 + E$$

Y = dependent variable (binary)
\alpha = constant
\beta_1 = coefficient of variable X_1
X_1 = independent variable
E = error term
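To make the expression concrete: in logistic regression the constant plus coefficient times X1 gives the log-odds of the outcome, and the logistic function turns that value into a probability between 0 and 1. The sketch below illustrates this for the pass/fail example. The coefficient values are hypothetical, chosen only for illustration, and the code is Python rather than the R used in the slides.

```python
import math

def logistic(z):
    """Map a linear predictor z (the log-odds) to a probability in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for very negative z
    return e / (1.0 + e)

# Hypothetical values, not taken from the slides.
CONSTANT = -4.0     # the constant term
COEF_HOURS = 1.0    # coefficient of X1 (study hours)

def pass_probability(study_hours):
    """Probability of 'Pass' for a given number of study hours."""
    return logistic(CONSTANT + COEF_HOURS * study_hours)

for hours in (1, 4, 8):
    print(hours, "hours ->", round(pass_probability(hours), 3))
```

With these assumed values the probability of passing rises with study hours and crosses one half where the linear predictor equals zero.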
Problem statement & Methodology
The purpose of the campaign is to get 25K customers registered for CRBT. This can be accomplished by identifying the customers (or prospects) who are most likely to respond out of a total base of around 100 million users.
We have sample data for both respondents and non-respondents to the campaign, so we used logistic regression. It allows us to predict a discrete outcome, such as whether a customer responds, from a set of variables that may be continuous, discrete, or a mix of both. Generally, the dependent (response) variable is dichotomous, such as success/failure.
Sample data of Respondents: 13,600 unique subscribers
Sample data of Non Respondents: 14,000 unique subscribers
Hypothesis tests:
Is an individual predictor variable significant?
Is the overall model significant?
Is Model A significantly better than Model B?
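Two common ways to answer these questions are the Wald test for a single coefficient and the likelihood-ratio test for comparing nested models. The slides do not say which tests were used, so the sketch below is an assumption, written in Python rather than R, with made-up input numbers. The likelihood-ratio helper covers only the case where the two models differ by exactly one parameter.

```python
import math

def wald_p_value(coef, std_err):
    """Two-sided p-value for H0: coefficient = 0, using z = coef / std_err."""
    z = coef / std_err
    return math.erfc(abs(z) / math.sqrt(2.0))

def lr_test_1df(loglik_small, loglik_big):
    """Likelihood-ratio test for nested models differing by ONE parameter.

    Returns (statistic, p_value); the statistic is referred to a
    chi-square distribution with 1 degree of freedom.
    """
    stat = 2.0 * (loglik_big - loglik_small)
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Hypothetical numbers for illustration only.
print("Wald p-value:", wald_p_value(coef=0.8, std_err=0.25))
print("LR test:", lr_test_1df(loglik_small=-120.0, loglik_big=-112.5))
```

A small p-value from the Wald test suggests the predictor is significant. A small p-value from the likelihood-ratio test suggests the larger model fits significantly better than the smaller one.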
Dataset used in model:
Outcome variable:
1: Responded
0: Not Responded
Predictors:
Average monthly spend
Operating system
2G data usage (MB)
3G data usage (MB)
Gender
Incoming messages
Handset type
Age on handset
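As a rough illustration of how a model with a 1/0 outcome and a few predictors can be fitted, the sketch below runs batch gradient ascent on the log-likelihood using only Python's standard library. This is not the authors' method: the slides fit the model in R, the data here is synthetic, and the two predictors (a standardized monthly spend and a smartphone flag) are hypothetical stand-ins for the variables listed above.

```python
import math
import random

def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(weights, x):
    """Probability of response for predictors x; weights[0] is the constant."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return logistic(z)

def fit_logistic(rows, labels, learning_rate=0.5, epochs=400):
    """Fit [constant, w1, ..., wk] by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - predict(weights, x)      # observed minus predicted
            grad[0] += err
            for j, xi in enumerate(x, start=1):
                grad[j] += err * xi
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad)]
    return weights

def make_synthetic(n=1000, seed=0):
    """Synthetic respondents (1) and non-respondents (0); all values are made up."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        spend = rng.gauss(0.0, 1.0)                    # standardized monthly spend
        smartphone = 1.0 if rng.random() < 0.5 else 0.0
        p = logistic(-0.5 + 1.5 * spend + 1.0 * smartphone)
        rows.append([spend, smartphone])
        labels.append(1 if rng.random() < p else 0)
    return rows, labels

rows, labels = make_synthetic()
weights = fit_logistic(rows, labels)
print("constant, spend, smartphone:", [round(w, 2) for w in weights])
```

In practice a statistics package would also report standard errors and fit diagnostics, which this sketch omits.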
Logistic Function using R
Logistic Regression Interpretation
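A standard way to read a fitted logistic regression coefficient is as an odds ratio: exponentiating the coefficient gives the factor by which the odds of responding are multiplied when that predictor increases by one unit, holding the others fixed. The sketch below shows the conversion in Python. The coefficient values are hypothetical and do not come from the model in these slides.

```python
import math

def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(coef)

def percent_change_in_odds(coef):
    """The same effect expressed as a percentage change in the odds."""
    return (math.exp(coef) - 1.0) * 100.0

# Hypothetical coefficients for illustration only.
example_coefs = {"avg_monthly_spend": 0.004, "is_smartphone": 0.9}

for name, coef in example_coefs.items():
    print(name, round(odds_ratio(coef), 3), round(percent_change_in_odds(coef), 1))
```

A positive coefficient gives an odds ratio above 1 (higher odds of responding), a negative one gives a ratio below 1, and a coefficient of zero leaves the odds unchanged.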
Predicted probability using our model
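Once a model is fitted, each customer in the base can be scored with a predicted probability of responding, and the campaign can target those with the highest scores. The sketch below illustrates that ranking step in Python. The constant, coefficients, field names, and customer records are all hypothetical and are not the fitted values from these slides.

```python
import math

def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical fitted model, for illustration only.
INTERCEPT = -2.0
COEFS = {"avg_monthly_spend": 0.004, "data_3g_mb": 0.002, "is_smartphone": 0.9}

def score(customer):
    """Predicted probability that this customer responds."""
    z = INTERCEPT + sum(COEFS[name] * customer[name] for name in COEFS)
    return logistic(z)

def top_prospects(customers, n):
    """Return the n customers with the highest predicted probability."""
    scored = [(score(c), c["id"]) for c in customers]
    scored.sort(reverse=True)
    return [(cid, p) for p, cid in scored[:n]]

# Made-up customer records.
CUSTOMERS = [
    {"id": "A", "avg_monthly_spend": 100, "data_3g_mb": 0, "is_smartphone": 0},
    {"id": "B", "avg_monthly_spend": 500, "data_3g_mb": 800, "is_smartphone": 1},
    {"id": "C", "avg_monthly_spend": 250, "data_3g_mb": 200, "is_smartphone": 1},
    {"id": "D", "avg_monthly_spend": 300, "data_3g_mb": 50, "is_smartphone": 0},
]

for cid, p in top_prospects(CUSTOMERS, 2):
    print(cid, round(p, 3))
```

Applied to the full base, the same ranking would let the campaign contact only as many of the top-scored prospects as are needed to reach the registration target.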