Changwon Nati Univ. ISIE 2001
SOFSEM’06
A Personalized Recommendation System Based on PRML for E-Commerce
Young Ji Kim, Hyeon Jeong Mun,
Jae Young Lee and Yong Tae Woo
Dept. of Computer Sciences, Kosin University, Korea
Personalization
What is personalization?
– The process of customizing the content and structure of a web site to the specific needs of each individual user, based on the user’s behavior patterns.
Why is personalization needed?
– It is a technique for maintaining close relationships with clients.
• Analyzing clients’ preferences.
• Providing differentiated service to preferred clients in Internet-based applications.
– It plays an important role in a one-to-one marketing strategy that aims to enhance both customer satisfaction and profits on an e-commerce site.
Personalization
What does personalization require?
– Knowing the client’s preferences.
• What did clients buy?
• What did clients want or like?
• What will the client be interested in?
– Steps to personalization.
• Collect the user’s behavior.
• Analyze the user’s behavior from the collected data.
• Predict the user’s behavior using the analysis results.
• Recommend items the client will be interested in.
Personalized Recommendation System
What is a personalized recommendation system?
– A system that analyzes a user’s behavioral patterns and recommends new products that best match the individual user’s preferences.
Existing recommendation techniques
– Rule-based filtering
• Uses demographic information.
– Collaborative filtering
• Uses the rating values of other users with similar preferences.
– Content-based filtering
• Compares the user profile with product descriptions.
– Item-based filtering
• Analyzes associations among products.
Personalized Recommendation System
Problems with the existing techniques
– Some users are concerned about privacy.
• They do not enter personal information.
• They enter incorrect information.
– It is not easy to dynamically incorporate time-varying aspects of user preference from existing log files.
– Existing log files do not contain enough personal information.
– Existing methods are tailored to particular applications.
– They lack the ability to analyze user behavior patterns.
– They lack the ability to dynamically generate and recommend web content.
Proposed System
Proposed system
– We propose a new personalized recommendation technique based on PRML.
– First, we create a PRML instance for each user.
• The user’s behaviors are collected from XML-based web sites.
• They are saved as a PRML instance.
– Second, we build each user’s profile.
• Each user’s PRML instance is analyzed.
• The profile is built from the analysis results.
– Third, we recommend the products with the Top-N similarities.
• Personalized recommendations are made by computing the similarity between the information about new products and the user’s profile.
Personal Information Collection System
What is PICS (Personal Information Collection System)?
– It collects a user’s behavioral patterns while the user is connected.
• When the user connects.
• Where the user connects.
• What the user does.
– Clicking, reading and scrapping content, using the shopping cart, purchasing, etc.
– It saves them as PRML instances.
Existing method of collecting user behavior
– Individual users’ behavior patterns must be extracted from massive web logs.
– Different web servers record log information in various formats, such as CLF (Common Log Format), IIS, and W3C Extended.
Personal Information Collection System
[Figure: existing method of collecting user behavior]
Personal Information Collection System
Existing method of collecting user behavior
– Requires the preprocessing step described in the previous section.
– Servers use different log formats, and unnecessary data such as image or script requests must be removed.
– It is difficult to extract session information to identify an individual user.
– It is difficult to collect user behavior in real time.
Proposed PICS
– Implemented to collect personalized information from each client’s behavior in real time.
– Saves the personalized information as PRML instances.
Personal Information Collection System
[Figure: configuration of the personal information collection system]
PRML for Personalized Services
What is PRML?
– Personalized Recommendation Markup Language.
– Designed to efficiently store and manage each client’s behaviors.
[Figure: conceptual diagram of the PRML schema. Surviving node labels: PRML, USER, User Identification Information, User Request/Server Response, CBR-Based Feature Information, Implicit Rating Information. Surviving cardinality labels: 1…m, 1…m, 0…m, 0…m.]
User Session Management Module
Purpose
– To effectively identify and manage user information.
What does it do?
– An agent on the server side collects user access information from each user session.
• User ID, session ID, IP address, URL, server status, etc.
– It converts the user access information into a PRML instance.
– The PRML instance is summarized as user identification information and log information.
– It saves the PRML instance in an XML database.
User Session Management Module
[Figure: schema structure of the personal identification information section in PRML]
User Session Management Module
Example of the personal identification information section in a PRML instance
<?xml version="1.0" encoding="UTF-8"?>
<PRML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="http://www.hibrain.net/prml/PRML.xsd">
  <USER ID="gdhong">
    <SESSIONID ID="JHPWDWORDS" LOGIN_DATE="2005/06/26 10:21:58" LOGOUT_DATE="2005/06/26 10:40:12"/>
    <IPADDR IP="203.246.6.121"/>
    <AGENT TYPE="Mozilla/4.0"/>
    <REQUEST_SET>
      <REQUEST>
        <REQUEST_URL URL="/serviet/RecruitManager?RecruitCmd=RecruitSummaryView"/>
        <TIME DATE="2005/06/26 10:23:22"/>
        <BYTES SIZE="1024"/>
        <HTTPCODE METHOD="GET" NAME="HTTP/1.1" STATUS_CODE="200"/>
        …
      </REQUEST>
      <REQUEST>…</REQUEST>
      …
    </REQUEST_SET>
  </USER>
</PRML>
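To make the structure of the identification section concrete, the following is a minimal Python sketch that reads a trimmed PRML instance and pulls out the user, session, and request data. The element and attribute names are taken from the example above; the function name `summarize_session` and the shape of the returned dictionary are illustrative assumptions, not part of the proposed system.

```python
import xml.etree.ElementTree as ET

# Trimmed PRML instance modeled on the slide's example (one request only,
# schema attributes omitted for brevity).
PRML_TEXT = """<PRML>
  <USER ID="gdhong">
    <SESSIONID ID="JHPWDWORDS" LOGIN_DATE="2005/06/26 10:21:58"
               LOGOUT_DATE="2005/06/26 10:40:12"/>
    <IPADDR IP="203.246.6.121"/>
    <AGENT TYPE="Mozilla/4.0"/>
    <REQUEST_SET>
      <REQUEST>
        <REQUEST_URL URL="/serviet/RecruitManager?RecruitCmd=RecruitSummaryView"/>
        <TIME DATE="2005/06/26 10:23:22"/>
        <BYTES SIZE="1024"/>
        <HTTPCODE METHOD="GET" NAME="HTTP/1.1" STATUS_CODE="200"/>
      </REQUEST>
    </REQUEST_SET>
  </USER>
</PRML>"""


def summarize_session(prml_text):
    """Return the user ID, session ID, IP address, and requests of one PRML instance."""
    root = ET.fromstring(prml_text)
    user = root.find("USER")
    requests = [
        {
            "url": req.find("REQUEST_URL").get("URL"),
            "time": req.find("TIME").get("DATE"),
            "status": int(req.find("HTTPCODE").get("STATUS_CODE")),
        }
        for req in user.iter("REQUEST")  # every REQUEST under REQUEST_SET
    ]
    return {
        "user_id": user.get("ID"),
        "session_id": user.find("SESSIONID").get("ID"),
        "ip": user.find("IPADDR").get("IP"),
        "requests": requests,
    }


summary = summarize_session(PRML_TEXT)
print(summary["user_id"], summary["session_id"], len(summary["requests"]))
```

Because each session is already a self-describing XML fragment, no log-format-specific preprocessing is needed before this kind of extraction.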
Implicit Rating Information Collection Module
Purpose
– To implicitly collect rating information from XML-based web sites by exploiting the hierarchical structure of XML documents.
Preparation
– Elements in the XML documents are assigned different weights based on their importance in the documents.
– These weights are stored in the element weight database.
What does it do?
– When a user visits a web site, the module collects the XML elements of the XML content the user accessed.
– It saves them as a PRML instance.
Implicit Rating Information Collection Module
[Figure: configuration of the implicit rating collection technique]
[Figure: schema of the implicit rating information collection section]
Experimental XML document
[Figure: XML schema structure of faculty contents]
Experimental Element Weight Database
Element weight database
– In the element weight database, each element has a level weight and an element weight.
– The level weight of an element.
• Determined by its position in the hierarchy of the XML document.
– The element weight of an element.
• Reflects the element’s importance in the XML documents.
[Figure: an experimental element weight database]
Implicit Rating Information Module
CBR Feature Information Collection Module
Purpose
– To collect CBR feature information in order to extract the user’s preferences for web site content.
Preparation
– Select feature elements.
• Some elements in an XML document are considered important characteristics.
– Store them in the XML document characteristics database.
What does it do?
– When a user accesses an XML document, the feature information in the document is collected.
– It is saved as a PRML instance along with the user’s implicit rating information.
CBR Feature Information Collection Module
[Figure: configuration of the CBR feature collection technique]
[Figure: schema structure of the CBR feature collection section]
Proposed Personalized Recommendation System
Personalized Recommendation System
– Uses a CBR-based learning technique.
– Creates a user profile from the PRML instance and saves it in the user profile database.
– Computes the similarity between the user profile and each new product.
– Recommends to the user the new products with the Top-N similarities.
Proposed Personalized Recommendation System
[Figure: configuration of the proposed system using the CBR technique. Surviving labels: Personalized Rating Information Calculation Module, Element Weight Database.]
Personalized Rating Information Calculation Module
Purpose
– To compute the user’s preference for each content item the user accessed.
• Uses the implicit rating information collection section of the PRML instance and the element weight database.
Steps to calculate implicit rating information
– Group all the elements by content ID.
• All the elements collected by the implicit rating information collection module are divided into groups according to the content they belong to.
– Retrieve the element weights and level weights from the element weight database.
– Compute the rating information of each content item.
Personalized Rating Information Calculation Module
Rating information of the content
$R_c = \sum_{e \in V} l_e \, k_e$
– V is the set of elements in the XML content the user accessed.
– l_e is the level weight of the element e.
– k_e is the element weight of e.
– R_c is the implicit rating information.
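The calculation steps (group by content ID, look up weights, sum per content) can be sketched in a few lines of Python. This sketch assumes the rating of a content item is the sum, over the accessed elements, of level weight times element weight; the element names, the weight values, and the `implicit_ratings` function are hypothetical and only illustrate the bookkeeping.

```python
from collections import defaultdict

# Hypothetical element weight database:
# element name -> (level weight l_e, element weight k_e).
ELEMENT_WEIGHTS = {
    "title": (1.0, 0.5),
    "major": (0.5, 0.8),
    "location": (0.5, 0.4),
}


def implicit_ratings(accessed, weights):
    """Compute one rating per content item.

    accessed: iterable of (content_id, element_name) pairs, as they might be
    read from the implicit rating section of a PRML instance.
    """
    # Step 1: group the accessed elements by content ID. A set is used so
    # that V holds each element once, even if it was accessed repeatedly.
    by_content = defaultdict(set)
    for content_id, element in accessed:
        by_content[content_id].add(element)

    # Steps 2 and 3: look up both weights and sum l_e * k_e over each group.
    return {
        content_id: sum(weights[e][0] * weights[e][1] for e in elements)
        for content_id, elements in by_content.items()
    }


accessed = [("c1", "title"), ("c1", "major"), ("c2", "title"), ("c1", "major")]
ratings = implicit_ratings(accessed, ELEMENT_WEIGHTS)
print(ratings)
```

Treating V as a set means repeated visits to the same element do not raise the rating; counting repeats instead would be an equally plausible reading and only requires replacing the set with a list.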
CBR-based Learning Technique
Traditional case-based reasoning system
– When a new problem appears, the system retrieves the most similar case and reuses it to solve the problem.
– It revises the proposed solution if necessary and retains the new solution as part of a new case.
Proposed CBR-based learning technique
– Build the user’s profile by analyzing the user’s behavior patterns.
– Recommend the most similar items using the past preference information stored in the user profile.
– Update the user profile to learn the new case.
User Profile Management Module
Select contents
– Select the contents whose implicit rating value (R_c) is high.
• Build the user profile from the CBR feature information of the selected contents.
User profile
– P = (u, A, R, D)
• u is the user ID.
• A is the set of attributes in the web contents.
• R is the set of intra-attribute weights.
• D is the set of inter-attribute weights.
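One way to hold the four-part profile in code is a small record type. The sketch below is only an assumed representation: the class name `UserProfile` and the choice of dictionaries keyed by attribute name are illustrative, not something the slides specify.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """P = (u, A, R, D), stored as dictionaries keyed by attribute name."""
    u: str                                  # user ID
    A: dict = field(default_factory=dict)   # attribute -> list of attribute values
    R: dict = field(default_factory=dict)   # attribute -> {value: intra-attribute weight}
    D: dict = field(default_factory=dict)   # attribute -> inter-attribute weight


# The weights start empty and are filled in once the counts are analyzed.
profile = UserProfile(u="gdhong", A={"Major": ["Database", "Animation", "Network"]})
print(profile)
```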
User Profile Management Module
Intra-attribute weights
– The intra-attribute weights R of A_i are {r_i1, r_i2, ···, r_im}.
$r_{ij} = \frac{k_{ij}}{\sum_{p=1}^{m} k_{ip}}$, for i = 1, 2, ···, n and j = 1, 2, ···, m.
• k_ij is the number of times the attribute value a_ij is accessed.
• r_ij represents how strongly the user prefers the attribute value a_ij over the other values of A_i.
User Profile Management Module
Intra-attribute weights
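User profile (user ID u: gdhong), with the intra-attribute weights still to be computed (r_ij = ?)

Attribute (A) | Attribute value (a_i1..a_im) | Appear count (k_ij) | Intra-attribute weight (R) | Inter-attribute weight (D)
Major | Database | 7 | - | -
Major | Animation | 1 | - |
Major | Network | 2 | - |
Position | Professor | 4 | - | -
Position | Researcher | 3 | - |
Position | Post-Doc | 3 | - |
Location | Pusan | 2 | - | -
Location | Seoul | 8 | - |

Computing r_ij for A_1 (Major)

Attribute value | Appear count | Intra-attribute weight
a_11 Database | k_11 = 7 | r_11 = 0.7
a_12 Animation | k_12 = 1 | r_12 = 0.1
a_13 Network | k_13 = 2 | r_13 = 0.2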
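The Major example can be reproduced with a short Python sketch: each appear count is divided by the total count for the attribute. The function name `intra_attribute_weights` and the zero-count guard are assumptions added for illustration.

```python
def intra_attribute_weights(counts):
    """Normalize appear counts into intra-attribute weights.

    counts: {attribute value: appear count k_ij} for a single attribute A_i.
    """
    total = sum(counts.values())
    if total == 0:
        # Assumed behavior: with no accesses at all, every weight is zero.
        return {value: 0.0 for value in counts}
    return {value: k / total for value, k in counts.items()}


major = intra_attribute_weights({"Database": 7, "Animation": 1, "Network": 2})
print(major)
```

For the counts 7, 1, and 2 this yields 0.7, 0.1, and 0.2, matching the table, and the weights of one attribute always sum to 1.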
User Profile Management Module
Inter-attribute weights
– The inter-attribute weights D of A are {d_1, d_2, ···, d_n}.
• Each d_i represents how strongly the attribute A_i is preferred by the user.
– If d_i is large,
• the attribute A_i is more important to the user than the other attributes.
User Profile Management Module
Inter-attribute weights
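User profile (user ID u: gdhong), with the inter-attribute weights still to be computed (d_i = ?)

Attribute (A) | Attribute value (a_i1..a_im) | Appear count (k_ij) | Intra-attribute weight (R) | Inter-attribute weight (D)
Major | Database | 7 | 0.7 | -
Major | Animation | 1 | 0.1 |
Major | Network | 2 | 0.2 |
Position | Professor | 4 | 0.4 | -
Position | Researcher | 3 | 0.3 |
Position | Post-Doc | 3 | 0.3 |
Location | Pusan | 2 | 0.2 | -
Location | Seoul | 8 | 0.8 |

Computing each d_i of A_i
– d_1 of Major (A_1) = 0.7 – (1/3) ≈ 0.4
– d_2 of Position (A_2) = 0.4 – (1/3) ≈ 0.1
– d_3 of Location (A_3) = 0.8 – (1/2) = 0.3
– These values are consistent with taking the largest intra-attribute weight of A_i, subtracting 1/m (m being the number of values of A_i), and rounding to one decimal place.

Attribute | Inter-attribute weight
A_1 Major | d_1 = 0.4
A_2 Position | d_2 = 0.1
A_3 Location | d_3 = 0.3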
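The worked values can be checked with a small Python sketch. The transcript does not preserve a general formula for the inter-attribute weight, so the rule used here (largest intra-attribute weight minus 1/m, rounded to one decimal place) is an assumption inferred from the three computations on this slide, and the function name is hypothetical.

```python
def inter_attribute_weight(intra_weights):
    """Assumed rule: largest intra-attribute weight minus the uniform share 1/m.

    intra_weights: {attribute value: r_ij} for a single attribute A_i.
    """
    m = len(intra_weights)
    return max(intra_weights.values()) - 1.0 / m


R = {
    "Major": {"Database": 0.7, "Animation": 0.1, "Network": 0.2},
    "Position": {"Professor": 0.4, "Researcher": 0.3, "Post-Doc": 0.3},
    "Location": {"Pusan": 0.2, "Seoul": 0.8},
}

# Round to one decimal place, as the slide's values appear to be.
D = {attr: round(inter_attribute_weight(w), 1) for attr, w in R.items()}
print(D)
```

Under this reading, an attribute whose values are accessed evenly gets a weight near zero, while an attribute dominated by one value (such as Major) gets a larger weight.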
Contents Recommendation Module
Contents Recommendation Module
– Analyzes an individual user’s behavioral patterns to generate recommendations for that user.
– Uses a nearest-neighbor approach to compute the similarity between the attributes of the user profile (P) and each new product (I).
To compute similarity
• a_ij is the attribute value of A_i in P.
• a’_ij is the corresponding attribute value in I.
• f(a_ij, a’_ij) returns 1 if a_ij = a’_ij, and 0 otherwise.
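The similarity formula itself did not survive in the transcript, so the following Python sketch assumes one plausible form: for each attribute, the inter-attribute weight multiplies the intra-attribute weight of whichever profile value matches the item, and the products are summed. The functions `similarity` and `recommend_top_n`, the item records, and the weighting scheme are all illustrative assumptions built on the match function f described above.

```python
def similarity(R, D, item):
    """Assumed similarity: sum over attributes of d_i * r_ij * f(a_ij, a'_ij)."""
    score = 0.0
    for attr, weights in R.items():
        item_value = item.get(attr)
        for value, r in weights.items():
            f = 1 if value == item_value else 0  # exact-match indicator
            score += D[attr] * r * f
    return score


def recommend_top_n(R, D, items, n):
    """Return the n items most similar to the profile, best first."""
    ranked = sorted(items, key=lambda it: similarity(R, D, it), reverse=True)
    return ranked[:n]


# Profile weights taken from the user profile example for user gdhong.
R = {
    "Major": {"Database": 0.7, "Animation": 0.1, "Network": 0.2},
    "Position": {"Professor": 0.4, "Researcher": 0.3, "Post-Doc": 0.3},
    "Location": {"Pusan": 0.2, "Seoul": 0.8},
}
D = {"Major": 0.4, "Position": 0.1, "Location": 0.3}

# Hypothetical new faculty postings.
items = [
    {"id": 1, "Major": "Database", "Position": "Professor", "Location": "Seoul"},
    {"id": 2, "Major": "Animation", "Position": "Researcher", "Location": "Pusan"},
    {"id": 3, "Major": "Network", "Position": "Post-Doc", "Location": "Seoul"},
]

top = recommend_top_n(R, D, items, n=2)
print([it["id"] for it in top])
```

With these weights the posting that matches the user's most frequent major and location ranks first, which is the behavior a Top-N recommendation over the profile is meant to produce.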
Experimental Results
Experiment
– Experimental content
• XML contents of a faculty position recruiting web site.
– Number of users
• 824.
– Accessed contents
• 1,144 XML faculty contents.
– New contents
• 1,484 faculty contents.
Experiment for Personal Information Collection System
[Figure: PRML instance]
Experiment for Proposed Recommendation System
User profile (user ID u: gdhong)

Attribute of item (A) | Attribute value (a_i1..a_im) | Appear count (k_ij) | Intra-attribute weight (R) | Inter-attribute weight (D)
Major | Database | 7 | 0.7 | 0.4
Major | Animation | 1 | 0.1 |
Major | Network | 2 | 0.2 |
Position | Professor | 4 | 0.4 | 0.1
Position | Researcher | 3 | 0.3 |
Position | Post-Doc | 3 | 0.3 |
Location | Pusan | 2 | 0.2 | 0.3
Location | Seoul | 8 | 0.8 |
Experiment for Proposed Recommendation System
Experimental results of recommendation
– Evaluated with MAE (Mean Absolute Error) and ROC (Receiver Operating Characteristic) measures.
[Figure: “Existing Method vs. Proposed Method”. Chart of Rating Value (axis ticks 0 to 3) by Rating Method (MAE, Sensitivity, Specificity, Accuracy, Error rate) for three series: Demographic, CF, and Proposed.]
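The chart names five measures without showing how they are computed. The Python sketch below uses their standard textbook definitions; how the experiment thresholded ratings into "recommended" and "relevant" is not stated, so the boolean inputs, the function names, and the sample data are assumptions.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def roc_measures(recommended, relevant):
    """Confusion-matrix measures for equal-length lists of booleans.

    Assumes at least one relevant and one non-relevant item, so that no
    denominator is zero.
    """
    pairs = list(zip(recommended, relevant))
    tp = sum(1 for r, g in pairs if r and g)          # recommended and relevant
    fp = sum(1 for r, g in pairs if r and not g)      # recommended, not relevant
    tn = sum(1 for r, g in pairs if not r and not g)  # correctly left out
    fn = sum(1 for r, g in pairs if not r and g)      # relevant but missed
    accuracy = (tp + tn) / len(pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
    }


mae = mean_absolute_error([2.5, 1.0, 3.0], [3.0, 1.0, 2.0])
measures = roc_measures(
    recommended=[True, True, False, False, True],
    relevant=[True, False, False, True, True],
)
print(mae, measures)
```

Lower MAE and error rate are better, while higher sensitivity, specificity, and accuracy are better, which is how the three methods in the chart would be compared.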
Conclusion
Proposed system
– A personalized recommendation system.
– Uses the PRML approach.
– Defines inter-attribute and intra-attribute weights.
– Builds the user profile from the user’s behavioral patterns.
– Recommends the products with the Top-N similarities.
Future work
– Research on a personalized recommendation system using ontologies.
• A user ontology extending the proposed user profile.
• A domain ontology to represent content features.
• A log ontology to represent user behavior patterns.