1 Business System Analysis & Decision Making Data Mining and Web Mining Zhangxi Lin ISQS 5340 Summer II 2006


1

Business System Analysis & Decision Making – Data Mining and Web Mining

Zhangxi Lin

ISQS 5340

Summer II 2006

2

Outline

Estimating the propensity to buy

Online recommendation

3

Estimating the Propensity to Buy

4

Goals for Building Propensity-to-Buy Models

Use propensity-to-buy scores for personalization and to influence dynamic Web content

Test interventions that are intended to increase the probability of the user making a purchase

Use propensity-to-buy models to evaluate banner ads with respect to increasing the propensity to buy at the target Web site

5

Issues Related to Propensity-to-Buy Models

Web log data alone is insufficient to build these models.

Implementation issues and the use of dynamic inputs need to be resolved before deciding on candidate models.

Binomial response modeling is required, but as an alternative, you can predict the amount spent in the current session, or you can employ a two-stage model that predicts both propensity to buy and amount spent.
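One way to read the two-stage option, assuming the amount spent is zero whenever no purchase occurs, is as a decomposition of the expected amount spent in a session with inputs x. The symbols below are illustrative and do not come from the slides:

```latex
E[\text{amount} \mid x] \;=\; P(\text{buy} \mid x)\,\cdot\, E[\text{amount} \mid \text{buy},\, x]
```

The first factor is the propensity-to-buy model; the second is a model of amount spent fitted only on sessions that ended in a purchase.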

6

7

Variable Definitions

8

Variable definitions (Cont’d)

9

SAS Enterprise Miner Model

Dataset

Dataset partitioned

Modeling in a decision tree

Assess the outcome

10

Decision Tree

11

Lift Value

12

Captured Response

13

Improving a Model

Add more inputs that may influence the target.

Experiment with transforming inputs.

Combine predictions using an average of two or more models.

Oversample when the success rate is small, as it is in the propensity-to-buy example.

Incorporate Bayesian concepts into the modeling process: prior distributions, profit and loss matrices (use the target profile tab/option in Enterprise Miner).
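Two of the suggestions above, averaging models and oversampling with prior distributions, can be sketched in a few lines of Python. This is a minimal illustration, not the Enterprise Miner workflow: the function names, the scores, the 2% purchase rate, and the 50/50 oversampled training set are all assumptions made for the example. The adjustment is the standard prior correction, which reweights each class by the ratio of its population prior to its share in the oversampled data.

```python
def adjust_for_oversampling(p_sample, true_rate, sample_rate):
    """Rescale a probability scored on oversampled data to the population prior.

    p_sample    -- probability of purchase predicted by the model
    true_rate   -- purchase rate in the full population (assumed known)
    sample_rate -- purchase rate in the oversampled training data
    """
    buy = p_sample * true_rate / sample_rate
    no_buy = (1.0 - p_sample) * (1.0 - true_rate) / (1.0 - sample_rate)
    return buy / (buy + no_buy)


def average_predictions(*model_scores):
    """Average the scores of several models, session by session."""
    return [sum(scores) / len(scores) for scores in zip(*model_scores)]


# Hypothetical scores for three sessions from two models, both trained on a
# 50/50 oversampled set; the true purchase rate is assumed to be 2%.
tree_scores = [0.80, 0.50, 0.10]
regression_scores = [0.60, 0.50, 0.30]

combined = average_predictions(tree_scores, regression_scores)
adjusted = [
    adjust_for_oversampling(p, true_rate=0.02, sample_rate=0.5)
    for p in combined
]
```

A score of 0.5 on the balanced sample maps back to the assumed 2% base rate, which is a quick sanity check that the correction is wired up the right way round.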

14

Online Recommendation

15

Statement of the Problem

There are M judges and N items.

Each judge j will have assigned a rating R(j,k) to item k if the judge has evaluated item k, otherwise the rating will be missing.

For all of the missing ratings, estimate the rating, or score item k for judge j.

Use the scores to estimate the top S scoring items that each judge has not rated.
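The problem statement can be sketched in Python with the ratings held as a nested dictionary, so that a missing rating is simply an absent key. The estimator used here, the mean observed rating of each item, is only a placeholder chosen for brevity and is not a method proposed in the slides; the judge and item names are made up.

```python
def item_mean_scores(ratings):
    """Score each item by the mean of its observed ratings (placeholder estimator)."""
    totals = {}
    for judge_ratings in ratings.values():
        for item, rating in judge_ratings.items():
            running_sum, count = totals.get(item, (0.0, 0))
            totals[item] = (running_sum + rating, count + 1)
    return {item: s / n for item, (s, n) in totals.items()}


def top_unrated(ratings, judge, s):
    """Return the S highest-scoring items that the given judge has not rated."""
    scores = item_mean_scores(ratings)
    rated = ratings.get(judge, {})
    candidates = [(score, item) for item, score in scores.items() if item not in rated]
    # Highest score first; ties broken alphabetically so the result is stable.
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in candidates[:s]]


# R(j, k) for M = 3 hypothetical judges and N = 4 hypothetical items.
ratings = {
    "judge1": {"A": 5, "B": 3},
    "judge2": {"A": 4, "C": 2, "D": 5},
    "judge3": {"B": 4, "C": 3, "D": 4},
}

top_for_judge1 = top_unrated(ratings, "judge1", 2)
```

Any estimator that fills in the missing R(j,k) values can replace `item_mean_scores`; the selection of the top S unrated items stays the same.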

16

Netflix™ Movie Rating

17

18

Applications of Recommender Systems

For customers visiting a retail Web site, use information from previous purchases to recommend:

Books

Music CDs

Movies

An “intelligent” music player: it plays music specifically selected by the user. When the music has finished and the user has not made a selection in over L seconds, the player makes a selection for the user based on the user's previous selections.

A news service that provides a personalized custom virtual newspaper to the subscriber based on past news article preferences. (These are usually content-based rather than collaborative.)

19

Applications of Recommender Systems

Personalize a user’s home page with “interesting” links, with links based on a recommender system algorithm that recommends links that should be interesting to the user.

Send a robot out looking for specific information, score each Web page using a recommender algorithm, and then return the K most interesting Web pages sorted by descending score (search engine applications).

Index a library of information based on recommender system scores.

20

Issues

Some applications will not have ratings, but rather 0/1 or No/Yes settings, for example, Yes, the customer has purchased the title, or No, the customer has not purchased the title.

Customer preferences may change over time, or a customer may discover a new artist, so re-training at regular intervals will be required for many applications.

21

A Conditional Frequency Approach

This method may be the current favorite, but it is generally considered to be inferior to more sophisticated approaches.

Obtain a subset of your recommender data that only has customers who purchased the given item, call it item K. Ratings are 0 (did not buy) or 1 (bought).

Derive the frequency distribution by all other titles.

Sort from highest frequency to lowest.

Recommend a set number of the highest frequency items. Most Web sites present the top 3 or the top 5 most frequent items.

Example: Amazon.com
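The conditional frequency steps above can be sketched in Python as follows. The purchase data, customer IDs, and function name are hypothetical; each customer's 0/1 ratings are represented as the set of titles they bought.

```python
from collections import Counter


def recommend_by_cofrequency(purchases, item_k, top_n=3):
    """Recommend the titles bought most often by customers who bought item_k."""
    # Step 1: keep only customers who purchased item K.
    buyers_of_k = [basket for basket in purchases.values() if item_k in basket]

    # Step 2: frequency distribution over all other titles.
    counts = Counter()
    for basket in buyers_of_k:
        counts.update(title for title in basket if title != item_k)

    # Step 3: sort from highest frequency to lowest (ties alphabetical).
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

    # Step 4: recommend a set number of the highest-frequency items.
    return [title for title, _ in ranked[:top_n]]


# Hypothetical purchase history: customer -> set of titles bought.
purchases = {
    "c1": {"K", "X", "Y"},
    "c2": {"K", "X"},
    "c3": {"K", "Y", "Z"},
    "c4": {"X", "Z"},
    "c5": {"K", "X", "Z"},
}

recommendations = recommend_by_cofrequency(purchases, "K", top_n=2)
```

Customer c4 never bought K, so that basket does not contribute to the counts, which is what makes the frequencies conditional on item K.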