Assignment: Homework 4
Instructions: Provide an answer for each of the underlined questions below. Submit this form, and only this form, as your submission for this week.
1. Create a correlation matrix of all variables in the data set.
a) Which pair of variables is most highly correlated?
b) What reason might there be for the high correlation between these variables?
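As a starting point for Question 1, the following is a minimal sketch of how a correlation matrix can be computed and scanned for its strongest pair. The values are made-up toy numbers, not the assignment data, and the helper name pearson is hypothetical; only the column names profit, online, and tenure come from this form.

```python
import math

# Hypothetical toy values; replace with the columns of the actual data set.
data = {
    "profit": [10.0, 45.0, 30.0, 80.0, 25.0, 60.0],
    "online": [0, 1, 0, 1, 0, 1],
    "tenure": [2.0, 8.0, 5.0, 12.0, 3.0, 9.0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

names = list(data)

# Full correlation matrix as a nested dict: corr[a][b].
corr = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Each distinct pair once (the diagonal is always 1 and is skipped).
pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]

# The most highly correlated pair, judged by absolute correlation.
strongest = max(pairs, key=lambda p: abs(corr[p[0]][p[1]]))
print(strongest, round(corr[strongest[0]][strongest[1]], 3))
```

Ranking by absolute value treats a strong negative correlation as just as "high" as a strong positive one; whether that is the intended reading of 1a is a judgment call.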
2. Conduct a regression analysis with customer profitability (profit) as the outcome variable and the indicator of online use (online) as the predictor variable.
a) What is the predicted profitability for online customers and for offline customers?
b) Is the difference between online and offline customers in the sample indicative of a significant difference in profitability across these groups?
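For Question 2, it may help to see how a regression on a single 0/1 indicator maps onto group means. The sketch below fits ordinary least squares by hand on made-up toy numbers (not the assignment data): the intercept is the predicted profit when online = 0, and the intercept plus the slope is the predicted profit when online = 1. The t statistic for the slope is included because 2b asks about significance; turning it into a p-value needs a t distribution with n - 2 degrees of freedom, which this sketch leaves to the statistics package in use.

```python
import math

# Hypothetical toy values; replace with the actual profit and online columns.
profit = [10.0, 45.0, 30.0, 80.0, 25.0, 60.0, 15.0, 50.0]
online = [0, 1, 0, 1, 0, 1, 0, 0]

n = len(profit)
mean_x = sum(online) / n
mean_y = sum(profit) / n

# Ordinary least squares with one predictor.
sxx = sum((x - mean_x) ** 2 for x in online)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(online, profit))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Predicted profitability for each group.
pred_offline = intercept          # online = 0
pred_online = intercept + slope   # online = 1

# Standard error and t statistic of the slope (the group difference).
residuals = [y - (intercept + slope * x) for x, y in zip(online, profit)]
sse = sum(r ** 2 for r in residuals)
se_slope = math.sqrt(sse / (n - 2) / sxx)
t_stat = slope / se_slope

print(round(pred_offline, 2), round(pred_online, 2), round(t_stat, 2))
```

With a binary predictor, the two predictions coincide with the sample mean of profit in each group, so the slope is simply the difference between the group means.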
3. What role do customer demographics (age and income) play in analyzing customer profitability for online and offline customers? To answer this question, conduct the following analyses and report your results:
a) Repeat the regression analysis for Question 2 above, and include “age” (age) as an additional predictor variable.
i. Is the overall fit of the model (adjusted R²) better with age included or without?
b) Create a variable “demographics” that takes the value “1” if the “age” and “income” variables are available for a particular customer, and “0” if these variables are missing. Conduct a regression analysis with customer profitability as the outcome variable and the newly created variable, “demographics”, as the predictor variable.
i. Based on the parameter estimates from your model output, how much does profitability differ between customers with complete demographics and those without?
c) Conduct a regression analysis with customer profitability as the outcome variable and the following predictor variables: indicator of online usage (online), customer tenure (tenure), region 2 (D1200), and region 3 (D1300).
i. Of the predictors in our 3c model, which variable has the biggest effect on profitability?
ii. Of the predictors in our 3c model, which variable has the smallest effect on profitability?
d) Using the model created in 3c, does it appear that our predictor variables provide us with a good model overall of our outcome variable (consider adjusted R² to answer this question)?
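The mechanics behind 3b through 3d can be sketched as follows, again on made-up toy records rather than the assignment data. The sketch builds the demographics indicator (assuming it is 1 only when both age and income are present, with None standing in for a missing value), fits the 3c model by solving the normal equations, and computes adjusted R² as 1 - (1 - R²)(n - 1)/(n - p - 1), where p is the number of predictors. The helper name solve is hypothetical, and a statistics package would normally replace the hand-written solver.

```python
# Hypothetical toy records; None marks a missing value.
rows = [
    {"profit": 10.0, "online": 0, "tenure": 2.0, "D1200": 0, "D1300": 0, "age": 3, "income": 4},
    {"profit": 45.0, "online": 1, "tenure": 8.0, "D1200": 1, "D1300": 0, "age": None, "income": None},
    {"profit": 30.0, "online": 0, "tenure": 5.0, "D1200": 0, "D1300": 1, "age": 5, "income": 6},
    {"profit": 80.0, "online": 1, "tenure": 12.0, "D1200": 0, "D1300": 0, "age": 6, "income": 7},
    {"profit": 25.0, "online": 0, "tenure": 3.0, "D1200": 1, "D1300": 0, "age": None, "income": 2},
    {"profit": 60.0, "online": 1, "tenure": 9.0, "D1200": 0, "D1300": 1, "age": 4, "income": 5},
    {"profit": 15.0, "online": 0, "tenure": 4.0, "D1200": 0, "D1300": 0, "age": 2, "income": None},
    {"profit": 50.0, "online": 1, "tenure": 6.0, "D1200": 1, "D1300": 0, "age": 7, "income": 8},
]

# 3b: indicator is 1 only when both age and income are present (assumption).
demographics = [
    1 if r["age"] is not None and r["income"] is not None else 0 for r in rows
]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (M[i][k] - tail) / M[i][i]
    return x

# 3c: design matrix with an intercept column followed by the four predictors.
predictors = ["online", "tenure", "D1200", "D1300"]
X = [[1.0] + [float(r[name]) for name in predictors] for r in rows]
y = [r["profit"] for r in rows]

# Normal equations: (X'X) beta = X'y.
cols = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(cols)] for i in range(cols)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(cols)]
beta = solve(XtX, Xty)

# 3d: R-squared and adjusted R-squared.
fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
mean_y = sum(y) / len(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - mean_y) ** 2 for yi in y)
n, p = len(y), len(predictors)
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(demographics)
print(dict(zip(["intercept"] + predictors, (round(b, 3) for b in beta))))
print(round(r2, 3), round(adj_r2, 3))
```

Raw coefficients are expressed in the units of each predictor (for example, profit per year of tenure versus profit for a 0/1 switch), so comparing their sizes directly, as 3c asks, deserves some care.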
4. Based on what you have learned across all of your analyses in Questions 1-3:
a) Are customer demographics useful in understanding how profitable customers will be?
b) Why do you believe this?