Using Personal-Characteristic and Friend-Ranking in Blog Search
D93944003 趙國成, D97921018 陳信宏
2009/1/9
Outline
Scope
Problem
Solution
Evaluation
Conclusion & Future Works
Scope
The search targets are at the document level, i.e., entries of a feed.
The search targets are text only (no photos, videos, audio, etc.).
Special properties of blogs
Each article belongs to a specific category.
Each article belongs to a member, who has characteristics such as interests.
Members may have friends, which forms a social network.
Problem
How can this information be used to improve search effectiveness?
Category, personal characteristics, friend relations
Solution
Query = Keyword + Category
Weighting of personal characteristics: How many articles has the member posted? Does the member's interest fall into the queried category?
Weighting of friends: Are the member's friends also interested in the queried category?
More Precise Definitions
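Personal ranking of member m for category c:
\[ R_{psn}(m,c) = \frac{N(m,c)}{N(m)} \times \frac{N(m)}{N_{AVG}} \]
Here N(m,c) is presumably the number of articles member m has posted in category c, N(m) the total number of articles m has posted, and N_{AVG} the average number of articles per member; the operator joining the two ratios is assumed to be a product.
Final ranking of document d, written by member m, for queried category c:
\[ R(d) = R_{doc}(d) \, [1 + \alpha R_{psn}(m,c)] \, [1 + \beta R_{fnd}(m,c)] \]
R(d): final ranking. R_{doc}(d): result of LM. R_{psn}(m,c): personal ranking. R_{fnd}(m,c): friend ranking. m: member. c: category.
Friend ranking, where F is the set of friends of member m:
\[ R_{fnd}(m,c) = \frac{\sum_{f \in F} R_{psn}(f,c)}{|F|} \]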
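The three scores can be sketched in Python as follows. This is only an illustration under stated assumptions: the count tables `n_mc` and `n_m`, the `friends` mapping, and all function names are hypothetical, the two ratios in the personal ranking are combined by a product, and members with no articles or no friends are given a score of 0, a case the slides do not cover.

```python
from typing import Dict, Set, Tuple


def r_psn(member: str, category: str,
          n_mc: Dict[Tuple[str, str], int],
          n_m: Dict[str, int],
          n_avg: float) -> float:
    """Personal ranking: (N(m,c) / N(m)) * (N(m) / N_AVG), product assumed."""
    total = n_m.get(member, 0)
    if total == 0 or n_avg == 0:
        return 0.0  # assumption: no articles means no personal boost
    interest = n_mc.get((member, category), 0) / total  # share of posts in c
    activity = total / n_avg                            # posts relative to average
    return interest * activity


def r_fnd(member: str, category: str,
          friends: Dict[str, Set[str]],
          n_mc: Dict[Tuple[str, str], int],
          n_m: Dict[str, int],
          n_avg: float) -> float:
    """Friend ranking: mean personal ranking over the member's friends F."""
    f_set = friends.get(member, set())
    if not f_set:
        return 0.0  # assumption: no friends means no friend boost
    return sum(r_psn(f, category, n_mc, n_m, n_avg) for f in f_set) / len(f_set)


def final_rank(r_doc: float, psn: float, fnd: float,
               alpha: float, beta: float) -> float:
    """Final ranking: R_doc * (1 + alpha * R_psn) * (1 + beta * R_fnd)."""
    return r_doc * (1 + alpha * psn) * (1 + beta * fnd)
```

With alpha = beta = 0 the final ranking reduces to the pure LM score, so the two weights control how strongly the author's profile and the author's friends can boost a document.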
Implementation Steps
1. Define categories
2. Crawl pages from blog sites by each category
3. Generate the language model (LM) of the documents in each category.
4. Generate the member-page mapping.
5. Generate the member-friend mapping.
Define Categories
We empirically define 13 categories. We hope the categories are mutually
independent.
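1. Creative works (創作)
2. Travel (旅遊)
3. Food (美食)
4. Health care (醫療保健)
5. Sports (運動)
6. Film & TV (影視)
7. Lifestyle & leisure (生活休閒)
8. Science & technology (科學科技)
9. Anime, comics & video games (動漫電玩)
10. Learning (學習)
11. Finance & economics (財經)
12. Society, politics & economy (社會政經)
13. Other (其它)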
Crawl pages from blog sites by each category
Many blog websites let users browse by category, but not all of them do.
We crawl pages from the websites that provide this function and use them as training documents.
For other documents, we use a text classification algorithm to decide their categories.
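The slides do not say which text classification algorithm is used. As one possible sketch, the per-category language models can double as the classifier: train a unigram model with add-one smoothing on the crawled pages of each category, then assign an unlabeled document to the category whose model gives it the highest log-likelihood. The function names are hypothetical, and the input is assumed to be already tokenized (Chinese text would first need word segmentation).

```python
import math
from collections import Counter
from typing import Dict, List


def train_unigram_models(docs_by_category: Dict[str, List[List[str]]]) -> Dict[str, Counter]:
    """Count token frequencies over all training documents of each category."""
    return {
        category: Counter(token for doc in docs for token in doc)
        for category, docs in docs_by_category.items()
    }


def log_likelihood(tokens: List[str], counts: Counter, vocab_size: int) -> float:
    """Log-probability of the tokens under an add-one smoothed unigram model."""
    total = sum(counts.values())
    return sum(math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens)


def classify(tokens: List[str], models: Dict[str, Counter]) -> str:
    """Return the category whose model scores the document highest."""
    vocab = set().union(*models.values())
    vocab_size = len(vocab) + 1  # one extra slot reserved for unseen tokens
    return max(models, key=lambda c: log_likelihood(tokens, models[c], vocab_size))
```

With equal priors for all categories this is equivalent to a multinomial naive Bayes classifier; a real system would likely also weight categories by how many training pages each one has.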
Generate the member-page mapping
On almost all blog websites, the URL of each page contains the member ID.
http://www.wretch.cc/blog/ddedogtoootw/9759034
http://blog.udn.com/wong2006/2547710
http://tw.myblog.yahoo.com/jun681031-bear/article?mid=5556
Generate the member-page mapping
We can easily find the URL pattern rules and extract the member ID from the URL.
http://www.wretch.cc/blog/ddedogtoootw/9759034
http://www.wretch.cc/blog/minyang0925/20688505
http://www.wretch.cc/blog/greezydebut/7175512
http://www.wretch.cc/blog/ddedogtoootw/
http://www.wretch.cc/blog/minyang0925/
http://www.wretch.cc/blog/greezydebut/
ddedogtoootw
minyang0925
greezydebut
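A minimal sketch of such extraction rules, written as one regular expression per site. The patterns are inferred only from the example URLs on these slides and are assumptions about each site's URL scheme; `extract_member_id` is a hypothetical helper name.

```python
import re
from typing import Optional

# One pattern per blog site; the first path segment after the site prefix
# is taken to be the member ID (assumed from the example URLs).
MEMBER_ID_PATTERNS = [
    re.compile(r"^https?://www\.wretch\.cc/blog/([^/?#]+)"),
    re.compile(r"^https?://blog\.udn\.com/([^/?#]+)"),
    re.compile(r"^https?://tw\.myblog\.yahoo\.com/([^/?#]+)"),
]


def extract_member_id(url: str) -> Optional[str]:
    """Return the member ID embedded in a blog page URL, or None if unknown."""
    for pattern in MEMBER_ID_PATTERNS:
        match = pattern.match(url)
        if match:
            return match.group(1)
    return None
```

Grouping page URLs by the returned ID then gives the member-page mapping; URLs from sites without a known pattern return None and can be handled separately.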
Generate the member-friend mapping
What is the definition of friend?
My friend?
Somebody who has set me as his friend?
Somebody who has visited my blog?
Somebody who has commended my blog?
Somebody who has left messages for me?
…
Which definition is suitable for each blog website?
Generate the member-friend mapping
Our definition: somebody whose page URLs occur in my articles.
This relation is usually caused by “reply”.
…
http://www.wretch.cc/blog/love6380/20856457
http://www.wretch.cc/blog/oeoehaha/5943390
http://www.wretch.cc/blog/parfaite/15050239
…
Source: http://www.wretch.cc/blog/illyqueen/12364112
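Under this definition, the member-friend mapping can be built by scanning each article for page URLs of other members. The sketch below handles only wretch.cc page URLs and assumes that member IDs consist of letters, digits, hyphens, and underscores; `build_friend_map` is a hypothetical name, and links to the author's own pages are skipped.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Matches a wretch.cc article URL and captures the member ID (assumed format).
PAGE_URL = re.compile(r"https?://www\.wretch\.cc/blog/([A-Za-z0-9_-]+)/\d+")


def build_friend_map(articles: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Map each author to the members whose page URLs occur in the author's articles.

    `articles` yields (author_member_id, article_text) pairs.
    """
    friends: Dict[str, Set[str]] = defaultdict(set)
    for author, text in articles:
        for member_id in PAGE_URL.findall(text):
            if member_id != author:  # a link to one's own page is not a friend
                friends[author].add(member_id)
    return dict(friends)
```

The resulting relation is directed: an author who links to another member gains that member as a friend, but not the other way around.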
Conclusion of Solution
For each article, we know its category and author.
For each member (author), we know all the articles he has posted and his friends.
Hence we can calculate R(d).
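\[ R(d) = R_{doc}(d) \, [1 + \alpha R_{psn}(m,c)] \, [1 + \beta R_{fnd}(m,c)] \]
\[ R_{psn}(m,c) = \frac{N(m,c)}{N(m)} \times \frac{N(m)}{N_{AVG}} \]
(the operator joining the two ratios is assumed to be a product)
\[ R_{fnd}(m,c) = \frac{\sum_{f \in F} R_{psn}(f,c)}{|F|} \]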
Evaluation
How do we decide whether a document is relevant? By feedback from users.
Comparison: R_doc (pure LM) versus R (LM + R_psn + R_fnd)
What are the effects of α and β?
\[ R(d) = R_{doc}(d) \, [1 + \alpha R_{psn}(m,c)] \, [1 + \beta R_{fnd}(m,c)] \]
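One way to study the effect of α and β is a grid sweep: re-rank the same candidate documents for each (α, β) pair and score every ranking against the user feedback. The slides do not name an evaluation metric, so precision at k is an assumption here, as are the function names and the tuple layout of the candidates. Setting α = β = 0 reproduces the pure LM baseline.

```python
from typing import Dict, List, Sequence, Set, Tuple

# (doc_id, r_doc, r_psn, r_fnd) for each retrieved document (hypothetical layout)
Candidate = Tuple[str, float, float, float]


def rerank(candidates: Sequence[Candidate], alpha: float, beta: float) -> List[str]:
    """Order document IDs by R_doc * (1 + alpha * R_psn) * (1 + beta * R_fnd)."""
    scored = [
        (doc_id, r_doc * (1 + alpha * r_psn) * (1 + beta * r_fnd))
        for doc_id, r_doc, r_psn, r_fnd in candidates
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


def precision_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k documents that users marked as relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant) / k


def sweep(candidates: Sequence[Candidate], relevant: Set[str], k: int,
          grid: Sequence[float]) -> Dict[Tuple[float, float], float]:
    """Precision at k for every (alpha, beta) pair drawn from the grid."""
    return {
        (a, b): precision_at_k(rerank(candidates, a, b), relevant, k)
        for a in grid
        for b in grid
    }
```

Comparing the entry for (0, 0) with the other entries shows whether the personal and friend rankings improve on the pure LM result for a given query.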
Conclusion
We use the following information to improve search effectiveness:
Category, personal characteristics, friend relations
We will compare search effectiveness with and without our method.
Future works
How about considering feeds instead of entries?
Are there better definitions of personal characteristic and friend?
Is there a better equation for R(R_doc, R_psn, R_fnd)?
Thank you
We appreciate your suggestions!