The assignment is to build a predictive model for data supplied by an insurance company that wants to identify customers likely to purchase an insurance policy for a mobile home. The prediction task is motivated by a direct-mailing decision: mail will be sent only to customers with a high probability of becoming mobile home insurance policy holders.
You have been given a database of existing customers, which you will use to build the predictive model. The client wants you to predict, from other data about a customer, whether that customer will have a caravan (mobile home) insurance policy. The customer data consists of 51 variables and includes product usage data and socio-demographic data derived from postcodes. The training set contains 5300 customer records, each including whether or not the customer has a mobile home insurance policy. The test set contains 4000 customers for whom you do not know whether they have a mobile home insurance policy.
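As an illustration only, the mailing decision described above can be thought of as applying a cut-off to the purchase probabilities your model predicts. The short Python sketch below is not part of the required WEKA workflow; the function name `select_for_mailing`, the customer IDs, the probabilities and the 0.5 threshold are all hypothetical and chosen purely to show the idea.

```python
def select_for_mailing(scores, threshold=0.5):
    """Return the IDs of customers whose predicted purchase
    probability is at or above the threshold (hypothetical helper)."""
    return [customer_id for customer_id, p in scores.items() if p >= threshold]


# Hypothetical predicted probabilities, e.g. exported from a classifier.
scores = {"c1": 0.82, "c2": 0.07, "c3": 0.55, "c4": 0.31}

# Only customers at or above the (assumed) cut-off would receive mail.
mailing_list = select_for_mailing(scores, threshold=0.5)
print(mailing_list)
```

In practice the threshold would be chosen from the business context, for example by weighing the cost of a mailing against the expected value of a new policy holder.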
The deliverable for the assignment is a written report describing and justifying the steps you have taken and their results, including charts and numerical results where appropriate.
SOFTWARE TO USE: WEKA
ASSESSMENT CRITERIA/MARKING SCHEME
a) 20% for the response to comments
b) 50% for the first report, including:
   c. 10% for demonstrating good understanding of the business problem and its scale,
   d. 10% for identifying the potential processes/areas of improvement,
   e. 10% for proposing, describing and justifying the solutions to the identified problems,
   f. 5% for a clear conclusion to your report, consistent with the preceding discussion and summarising your main arguments.
c) 30% for the second report and the developed model, including:
   a. 5% for the style of the report,
   b. 10% for following a logical model development process,
   c. 10% for the description of the process, where it should be made clear how your findings at a preceding step inform the next steps you take.
1. Witten, I. and Frank, E. and Hall, M. (2011) “Data Mining: Practical Machine Learning Tools
Part I. Report (indicative maximum 1,500 words)