complex social network mining - tsinghuakeg.cs.tsinghua.edu.cn/jietang/publications/jie... · 3...
Post on 04-Aug-2020
3 Views
Preview:
TRANSCRIPT
1
Complex Social Network Mining —Theory, Methodologies, and Applications
Jie Tang
Department of Computer Science and Technology
Tsinghua University Email: jietang@tsinghua.edu.cn
2
Social Networks
Web 1.0 (1989)
Pages, hyperlinks
Relevance search
Web 2.0 (2004)
social networks
Blogs, micro-blogs
Mobile Web (2008-20)
Connecting via
mobiles…
Web-based (or mobile-based) social networks already
become a bridge to connect our real daily life and the virtual
web space
3
Web-based Social Network Mining —Theory, Methodologies, and Applications
Web1.0: Web of Pages Web 2.0: Web of People Web 3.0: Web of Semantics
Social theory Learning from users
Attribute/link
prediction
Search/query
over networks
Social dynamics
Trust and privacy
Social influence
analysis
Social data
integration
Social
knowledge
acquisition
1
2
3
4
5
6
7
Collective learning Graphical models
Theoretical layer
4
Outline
• ArnetMiner: Academic Social Network
• Core Techniques
– Knowledge Acquisition
– Semantic Integration
– Heterogeneous Ranking
– Social Influence Analysis
• Demo
5
提供全面的研究者网络分析与挖掘功能
Papers published: ACM TKDD, KDD’08-10, SDM’09,
ICDM’07-09, CIKM’07-09, DKE, JIS
http://arnetminer.org/
ArnetMiner.org - Academic research social network analysis and mining system
6
Why Arnetminer.org?
“Academic search is
treated as document
search, but ignore
semantics”
“The information
need is not only
about publication…”
7
Examples – Expertise search
Researcher A
• When starting a
work in a new research topic;
• Or brainstorming for novel
ideas.
• Who are experts in this field?
• What are the top conferences in
the field?
• What are the best papers?
• What are the top research labs?
8
Examples – Citation network analysis
Researcher B • an in-depth understanding
of the research field?
Self-Indexing Inverted Files for
Fast Text Retrieval
Static
Index Pruning for Information
Retrieval Systems
Signature les: An access Method
for Documents and
its Analytical Performance
Evaluation
Filtered
Document Retrieval with
Frequency-Sorted Indexes
Vector-space Ranking with
Effective Early Termination
Efficient Document Retrieval in
Main Memory
A Document-centric Approach
to Static Index Pruning in Text
Retrieval Systems
An Inverted Index
Implementation
Parameterised Compression for
Sparse Bitmaps
Introduction of Modern
Information Retrieval
Memory Efficient
Ranking
Topic 31: Ranking and Inverted Index
Topic 27: Information retrieval
Topic 1 : Theory
Topic 21: Framework
Topic 22: Compression
Other
Topic 23: Index method
Topic 34: Parallel computing
Basic theoryComparable workOther
Citation Relationship Type
Topics
9
Which conference
should we submit the
paper?
Researcher C
authors
content
Examples – Conference Suggestion
10
Who are best matching
reviewers for each
paper? KDD Committee
Paper content
conference
Examples – Reviewer Suggestion
11
Our Social Network is Black Social network without
role/relationship info, e.g. a company’s email network
CEO
Employee
How to
infer Manager
Latent relationship graph
Fortunately, user interactions form implicit groups
12
From BW to Color
13
1
3
2
14
Person Search
Basic Info.
Research Interests
Publications
Social Network
Citation statistics
Fundings
15
Expertise
Search
Finding experts,
expertise conferences,
and expertise papers
for ―information
retrieval‖
16
Course Search
Finding courses for
―data mining‖
17
Association Search
Finding associations
between persons
- high efficiency
- Top-K associations
Usage:
- to find a partner
- to find a person with
same interests
18
Sub-Graph Search
Sub graphs
19
Topic Browser
200 topics have been
discovered automatically
from the academic network
20
Academic Performance Measurement
Academic Statistics
Personal Statistics
21
Outline
• ArnetMiner: Academic Social Network
• Core Techniques
– Knowledge Acquisition
– Semantic Integration
– Heterogeneous Ranking
– Social Influence Analysis
• Demo
22
Homepage
Papers
ACM
Papers
DBLP
Papers
Libra
Data Sources
Integration
Social Network Extraction
Name Disambiguation
Profiling extraction
Homepage finding
Publication extraction
citation
Scholar
Social Network Storage
Storage
Indexing
Access interface
RNKB
Metadata
Modeling and Search Network
T
DNd
wzxad
β
Φ
α
A
θ
c
T
μ ψ
Social Network AnalysisM
ap
-redu
ce--Distrib
uted
pro
cessing p
latfo
rm
Topic model
Academic
suggestion Expertise search
Social involution
analysis
Citation tracing
analysis
Social influence
analysis
ArnetMiner: Overview
1
2 3
writewrite
cite
cite
cite
write
write
write
cite
Write
publish
publish
publish
publish
publish
publish
write
write
coauthor coauthor
Dr. Tang
Limin
Prof. Wang
Prof. Li
SVM...Association...
Tree CRF...
Semantic...EOS... Annotation...
IJCAI
ISWC
WWW
Pc member
23
Ruud Bolle Office: 1S-D58
Letters: IBM T.J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598 USA
Packages: IBM T.J. Watson Research Center
19 Skyline Drive
Hawthorne, NY 10532 USA
Email: bolle@us.ibm.com
Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's
Degree in Analog Electronics in 1977 and the Master's Degree in Electrical
Engineering in 1980, both from Delft University of Technology, Delft, The
Netherlands. In 1983 he received the Master's Degree in Applied Mathematics and in
1984 the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode
Island. In 1984 he became a Research Staff Member at the IBM Thomas J. Watson
Research Center in the Artificial Intelligence Department of the Computer Science
Department. In 1988 he became manager of the newly formed Exploratory Computer
Vision Group which is part of the Math Sciences Department.
Currently, his research interests are focused on video database indexing, video
processing, visual human-computer interaction and biometrics applications.
Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer
Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud
M. Bolle is a Member of the IBM Academy of Technology.
DBLP: Ruud Bolle
2006
Nalini K. Ratha, Jonathan Connell, Ruud M. Bolle, Sharat Chikkerur: Cancelable Biometrics:
A Case Study in Fingerprints. ICPR (4) 2006: 370-373EE50
Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint
Representation Using Localized Texture Features. ICPR (4) 2006: 521-524EE49
Andrew Senior, Arun Hampapur, Ying-li Tian, Lisa Brown, Sharath Pankanti, Ruud M. Bolle:
Appearance models for occlusion handling. Image Vision Comput. 24(11): 1233-1243 (2006)EE48
2005
Ruud M. Bolle, Jonathan H. Connell, Sharath Pankanti, Nalini K. Ratha, Andrew W. Senior:
The Relation between the ROC Curve and the CMC. AutoID 2005: 15-20EE47
Sharat Chikkerur, Venu Govindaraju, Sharath Pankanti, Ruud M. Bolle, Nalini K. Ratha:
Novel Approaches for Minutiae Verification in Fingerprint Images. WACV. 2005: 111-116EE46
...
Ruud Bolle Office: 1S-D58
Letters: IBM T.J. Watson Research Center
P.O. Box 704
Yorktown Heights, NY 10598 USA
Packages: IBM T.J. Watson Research Center
19 Skyline Drive
Hawthorne, NY 10532 USA
Email: bolle@us.ibm.com
Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's
Degree in Analog Electronics in 1977 and the Master's Degree in Electrical
Engineering in 1980, both from Delft University of Technology, Delft, The
Netherlands. In 1983 he received the Master's Degree in Applied Mathematics and in
1984 the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode
Island. In 1984 he became a Research Staff Member at the IBM Thomas J. Watson
Research Center in the Artificial Intelligence Department of the Computer Science
Department. In 1988 he became manager of the newly formed Exploratory Computer
Vision Group which is part of the Math Sciences Department.
Currently, his research interests are focused on video database indexing, video
processing, visual human-computer interaction and biometrics applications.
Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer
Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud
M. Bolle is a Member of the IBM Academy of Technology.
CT1: Knowledge Acquisition from Social Web (ACM TKDD, ISWC’06, ICDM’07, ACL’07, CIKM’07-08)
Contact Information
Educational history
Academic services
Publications
1
1
2
2
Ruud Bolle
Position
Affiliation
Address
Address
Phduniv
Phdmajor
Phddate
Msuniv
Msdate
Msmajor
BsunivBsdate
Bsmajor
Research Staff
IBM T.J. Watson Research
Center
P.O. Box 704
Yorktown Heights,
NY 10598 USA
bolle@us.ibm.com
Brown University
1984
Electrical Engineering
Delft University of Technology
Analog Electronics
1977
Delft University of Technology
IBM T.J. Watson Research
Center
19 Skyline Drive
Hawthorne, NY 10532 USA
IBM T.J. Watson
Research Center
Electrical Engineering
1980
Applied Mathematics
Msmajor
http://researchweb.watson.ibm.com/
ecvg/people/bolle.html
Homepage
Ruud BolleName
video database indexing
video processing
visual human-computer interaction
biometrics applications
Research_Interest
Photo
Publication 1#
Cancelable Biometrics:
A Case Study in
Fingerprints
ICPR 370
2006
Date
Start_page
Venue
Title
373
End_page
Publication 2#
Fingerprint
Representation Using
Localized Texture
Features
ICPR 521
2006
Date
Start_page
Venue
Title
524
End_page
. . .
Co-authorCo-author
1
Ruud Bolle
2
Publication #3
Publication #5
coauthor
coauthor
UIUC affiliation
Professor position
2
1
Two questions: • How to accurately extract the researcher
profile information from the Web?
• How to integrate the information from different
sources?
24
Researcher Network Extraction
Researcher
Homepage
Phone
Address
Phduniv
Phddate
PhdmajorMsuniv
Bsmajor
Bsdate
Bsuniv
Affiliation
Postion
Msmajor
Msdate
Fax
Person Photo
Publication
Research_Interest
NameAuthored
Title
Publication_venue
Start_page
End_page
Date
Coauthor
70.60% of the researchers
have at least one homepage
or an introducing page
There are a large number of
person names having the
ambiguity problem
85.6% from
universities
14.4% from
companies
71.9% are
homepages
28.1% are
introducing
pages
60% are natural
language text
40% are in lists
and tables
70% moved at least one time
Even 3 ―Yi Li‖ graduated from
the author’s lab
25
Our Approach Picture – based on Markov Random Field
aYbY
cY
eY
fY
dY
( | | )
( | | ~ )
i j j i
i j j i
P Y Y Y Y
P Y Y Y Y
Special cases: - Conditional Random Fields
- Hidden Markov Random
Fields
Markov Property:
x1
x2
x3
co-conference
cite
coauthor
cite
coauthor
coauthor
coauthort-coauthor
cite
co-conference
x4
x5
x6
x7
x8
x9
x10
x11
y1=1
y8=1
y9=3
y11=3
y10=3
y2=1
y3=1
y4=2
y7=2
y6=2
y5=2
Researcher Profiling Name Disambiguation
26
CT2: Semantic Integration (IEEE TKDE, SIGMOD’09, IJCAI’09, ISWC’09)
27
RiMOM-A Tool for Semantic Integration (OAEI’06-09)
0
0.5
1
Benchmark Results
Precsion
Recall
F-measure 0
0.2
0.4
0.6
0.8
1 Anatomy Results
Precision
Recall
Recall+
F-measure
0 0.2 0.4 0.6 0.8
1
agrafsa Subtrack Results
Precision http://keg.cs.tsinghua.edu.cn/project/RiMOM/
“I’m really surprised by the
good results of these years
RiMOM, you can compete with
the top systems that make use
of such background knowledge.‖
28
CT3: Topic-based Heterogeneous Ranking (Machine Learn. J, KDD’08, ICDM’08, CIKM’09, DKE)
Search with
keyword
Modeling using VSM Principles of Data Mining. DJ Hand - Drug Safety, 2007 - drugsafety.adisonline.com
Advances in Knowledge Discovery and Data Mining UM Fayyad, G Piatetsky-Shapiro, P Smyth, R…
Data Mining: Concepts and Techniques J Han, M Kamber - 2001…
Return
Search with
semantic
modeling
Modeling using semantic topics
Data
mining
Data mining
Association Rules
Database systems
Data management
Web databases
Information systems
0.4
0.2
0.15 0.1
0.05
0.02
Topics
Return
Experts Expertise
conferences
Expertise
papers
Data
mining
11
00
1 1 0 1
1 0 1 0 1
0 1
001
11
11
Query
vector
Doc1
vector
Doc3
vector
Doc4 vector
29
1. How to model the
heterogeneous academic
network?
2. How to capture the link
information for ranking
objects in the academic
network?
Challenges
-------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
-------------------------
-------------------------
Cite------------------------
------------------------
------------------------
------------------------
------------------------
Cite
Cite
Citewrite
write
write
Co-write
Co-writeCo-author
Co-author
PC member
chair
publish
publish
publish
-------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
-------------------------
-------------------------
Cite------------------------
------------------------
------------------------
------------------------
------------------------
Cite
Cite
Citewrite
write
write
Co-write
Co-writeCo-author
Co-author
PC member
chair
publish
publish
publish
-------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
-------------------------
-------------------------
Cite------------------------
------------------------
------------------------
------------------------
------------------------
Cite
Cite
Citewrite
write
write
Co-write
Co-writeCo-author
Co-author
PC member
chair
publish
publish
publish
30
Modeling the Academic Network
T
DNd
wzxad
β
Φ
α
A
θ
c
T
μ ψ
T
DNd
wzx
ad
β
Φ
α
AC
θ
c
T
D
Nd
wz
β
Φ
c
η,σ2
ad x
α
A
θ
ACT1 ACT2 ACT3
authors
Topic
words
conference
Author-Conference-Topic Model [Tang et al., 08]
31
Random walk over the
academic network Modeling academic
network with topics
Integrating Topic Model into Random Walk
-------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
------------------------
-------------------------
-------------------------
-------------------------
-------------------------
-------------------------
Cite------------------------
------------------------
------------------------
------------------------
------------------------
Cite
Cite
Citewrite
write
write
Co-write
Co-writeCo-author
Co-author
PC member
chair
publish
publish
publish+
Author-Conference-Topic
Model [Tang et al., 08]
32
Combination Method 1
ISWC
IJCAI
WWW
Tree CRF...
EOS...
Association...
Paper Graph Gp
Author Graph Ge
Prof.
WangProf. Tang
Jing Zhang
Conference
Graph Gc
λde
λed
λcd
λdc
λdd
Stage 1:
Random walk
Stage 2.
Topic-based
relevance
Ranking score
Topic-based
relevance score
Combination by
multiplication
ISWC
IJCAI
WWW
Tree CRF...
EOS...
Association...
Prof.
WangProf. Tang
Jing Zhang
Data
mining
Query
...
...
Topic layer
33
Query:
ontology alignment
ISWC
IJCAI
WWW
Tree CRF...
EOS...
Association...
posowl
Web
service
Paper Graph Gp
Author Graph Ge
Prof.
WangProf. Tang
Jing Zhang
Conference
Graph Gc
Hidden Theme
Graph Gt
λde
λed
λcd
λdc
λtdλdt
λqtλtq
λdd
Combination Method 2
Ranking score
Transition probability
34
Learning to Rank Experts
• Combining more information Empirical loss Model penalty
2
1
2
21 ,min
i i iT
n
a b
TTw
T T T
i
wz w x x
feature weight
Language model,
BM25, tf*idf
35
Heterogeneous Cross-domain Ranking
KDD
SDM
ICDM
PAKDD
?
P. Yu
?
Principles of Data Mining
Data Mining: Concepts and
Techniques
?
Conferences
Papers
Authors
?
?
Query: “data mining”
conf author/
paper
1 2
1
2
1, 2,1
mi 1 ,n , 1i i i i
S Ti i
n n
a b a b
S S S S T T T Tw w
i i
C Wz w x x z w x x
Loss in one domain Loss in another domain
1 2
1 12, ,1,
21 , 1 ,min
i i iS T
i i i
n n
T a b T a b
S S S S T T T
i iw w
TU
z w U x x z w xC U Wx
Common feature space
36
Learning Algorithm
• Equivalent objective function:
Optimize the loss function for
each domain
Common space discovery
Optimize the weight
via the common space
37
Experimental Results
• Data sets
– Homogeneous Data
• LETOR 2.0: TREC2003, TREC2004, and OHSUMED
– Heterogeneous Data
• Academic network consisting of 14,134 authors, 10,716
papers, and 1,434 conferences.
– Heterogeneous Tasks
• Expert finding vs. Bole search
• Baselines
– RSVM
– Language model
38
Results on Homogeneous Data
39
Results on Heterogeneous Data
40
Results on Heterogeneous Tasks
• Expert finding verse Bole search (finding best supervisor)
• To obtain ground truth of bole for each query
– We sent emails to 50 senior researchers and 50 junior researchers
(91.6% are post doc or graduates)
– Average their feedbacks
41
CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS)
• How to quantify the influence between users?
• What is the relationship between users?
• How to discover topic distribution over links?
• Can we predict the user’s actions? write
write
cite
cite
cite
write
write
write
cite
Write
publish
publish
publish
publish
publish
publish
write
write
coauthor coauthor
Dr. Tang
Limin
Prof. Wang
Prof. Li
SVM...Association...
Tree CRF...
Semantic...EOS... Annotation...
IJCAI
ISWC
WWW
Pc member
42
Topic-based Social Influence Analysis
• Social network -> Topical influence network
43
Social Influence Sub-graph on ―Data mining‖
44
Influential nodes on different topics
45
CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS)
• How to quantify the influence between users?
• What is the relationship between users?
• How to discover topic distribution over links?
• Can we predict the user’s actions? write
write
cite
cite
cite
write
write
write
cite
Write
publish
publish
publish
publish
publish
publish
write
write
coauthor coauthor
Dr. Tang
Limin
Prof. Wang
Prof. Li
SVM...Association...
Tree CRF...
Semantic...EOS... Annotation...
IJCAI
ISWC
WWW
Pc member
46
Mining Advisor-Advisee Relationship
from Research Publication Networks
Sm ithth
2000
2000
2001
2002
2003
22000000000000000000
1999
A da B ob
Jerry
Y ing
Input:T em poralcollaboration netw ork
O utput:R elationship analysis
(0.8,[1999,2000])
(0.7,[2000,2001])
(0.65,[2002,2004])
2004
A da
Bob
Y ing
Sm ith
(0.2,[2001,2003])
(0.5,[/,2000])
(0.9,[/,1998])
(0.4,[/,1998])
(0.49,[/,1999])
V isualized chorologicalhierarchies
Jerry
47
Results
48
Application: visualization
TPFG
RULE
49
Bole Search In Arnetminer
An example on a real
system: Arnetminer
Performance improvement
50
Results (cont.)
51
CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS)
• How to quantify the influence between users?
• What is the relationship between users?
• How to discover topic distribution over links?
• Can we predict the user’s actions? write
write
cite
cite
cite
write
write
write
cite
Write
publish
publish
publish
publish
publish
publish
write
write
coauthor coauthor
Dr. Tang
Limin
Prof. Wang
Prof. Li
SVM...Association...
Tree CRF...
Semantic...EOS... Annotation...
IJCAI
ISWC
WWW
Pc member
52
Original citation network Semantic citation network
Examples – Topic distribution analysis over citations
Researcher A • an in-depth understanding
of the research field?
VS.
Self-Indexing Inverted Files for
Fast Text Retrieval
Static
Index Pruning for Information
Retrieval Systems
Signature les: An access Method
for Documents and
its Analytical Performance
Evaluation
Filtered
Document Retrieval with
Frequency-Sorted Indexes
Vector-space Ranking with
Effective Early Termination
Efficient Document Retrieval in
Main Memory
A Document-centric Approach
to Static Index Pruning in Text
Retrieval Systems
An Inverted Index
Implementation
Parameterised Compression for
Sparse Bitmaps
Introduction of Modern
Information Retrieval
Memory Efficient
Ranking
Topic 31: Ranking and Inverted Index
Topic 27: Information retrieval
Topic 1 : Theory
Topic 21: Framework
Topic 22: Compression
Other
Topic 23: Index method
Topic 34: Parallel computing
Basic theoryComparable workOther
Citation Relationship Type
Topics
53
Problem: Link Semantic Analysis Topic modeling
over links Citation context
words
Link semantics
54
Pairwise Restricted Boltzmann Machines
(PRBMs)
Link context
words
Topic distribution
Link category
Latent variables
defined over the
link to bridge the
two pages
Pairwise Restricted Boltzmann
Machines (PRBMs) Example
55
Accuracy of Link Categorization
gPRBM: our approach
with generative
learning
dPRBM: our approach
with discriminative
learning
hPRBM: our approach
with hybrid learning
Tested on Arnetminer
citation data
56
CT4: Social Influence Analysis (KDD’10, KDD’10, KDD’09, ICDM’09, JIS)
• How to quantify the influence between users?
• What is the relationship between users?
• How to discover topic distribution over links?
• Can we predict the user’s actions? write
write
cite
cite
cite
write
write
write
cite
Write
publish
publish
publish
publish
publish
publish
write
write
coauthor coauthor
Dr. Tang
Limin
Prof. Wang
Prof. Li
SVM...Association...
Tree CRF...
Semantic...EOS... Annotation...
IJCAI
ISWC
WWW
Pc member
57
What can we do in SNS?
58
Social Action
Add favorites
Comment on Haiti
Earthquake
Publish in KDD
Conference
Twitter Flickr KDD
59
Action
1. Always watch news
2. Enjoy sports
3. … Attribute
Comment on Haiti
Earthquake
Time t Time t+1
60
Results
61
Arnetminer Today — A brief summary
62
• 2006/5, V0.1 Perl-based CGI version
– Profile extraction, person/paper/conf. search
• 2006/8, V1.0 Java (Demo @ ASWC)
– Rewrite the above functions
• 2007/7, V2.0 (Demo @ KDD, ISWC)
– New: survey search, research interest, association search
• 2008/4, V3.0 (Demo @ WWW)
– Query understanding, New search GUI, log analysis
• 2008/11, V4.0 (Demo @ KDD, ICDM)
– Graph search, topic mining, NSFC/NSF
• 2009/4, V5.0 (Demo @ KDD)
– Bole/course search, profile editing, open resources, #citation
• 2009/12, V6.0
– Academic statistics, user feedbacks, refined ranking
• V7.0, coming soon
– Name disambiguation, reviewer assignment, supervisor suggestion, open API
ArnetMiner’s History
63
* Arnetminer data:
> 0.6 M researcher profiles
> 3M papers
> 17M citation relationships
> 5K conferences
> 50M logs
* Visits come from more than 190
countries
* Continuously +20% increase of
visits per month
* >100,000 page views per day
ArnetMiner Today
Top 10 countries
1. USA 6. Canada
2. China 7. Japan
3. Germany 8. Taiwan
4. India 9. France
5. UK 10. Italy
… I’ve happened to visit your Arnetminer, and
shocked. It was really impressive, its usefulness
and your works!!! … [from …@selab.snu.ac.kr]
…I would first of all congratulate you on the
excellent work you have done in Arnetminer and I
am much inspired… [from …@nu.edu.pk]
… Arnetminer is one of my favorite tools to find
folk and academic relatives… [from …@qlink.com]
Dear Dr. Jie Tang,
Can you include our papers (http://www.waset.org)
in your Arnetminer?... [from …@waset.org]
.. top top! I am very interested in your ArnetMiner.
Is that possible give me a bit of your social
network data… [from …@cse.ust.hk]
Messages from Users
Title: Semantic Technologies for Learning and Teaching in Web 2.0. — Thanassis Tiropanis, Hugh Davis, Dave Millard, Mark Weal
…Exposing the expertise of the institution to the outside world in order to attract
funding and students. ArnetMiner is the most representative example of such tools
at the moment…
Contextualised queries and searches, searches across repositories potentially in
different departments or institutions, and matching of people for collaborative
activities. Best example of the surveyed technologies to this end is ArnetMiner.
a survey by UK Southampton
64
Opportunity: exploiting semantic web and social network
in the real-world
Social search
& mining
Social network
Extraction
Social network
Mining
IBM
Scientific
Literature
Users cover >180
countries
>600K researcher
>3M papers
Arnetminer.org
Advertisement
Advertisement
Recommendation
Sohu
Mobile Context
Mobile search
& recommendation
Nokia
Large-scale
Mining
Scalable algorithms
for message tagging
and community
Discovery
Energy trend
analysis
Energy product
Evolution
Techniques
Trend
Oil Company
Search, browsing, complex query, integration, collaboration, trustable
analysis, decision support, intelligent services,
Web, relational data,
ontological data,
social data
Data Mining and Social Network techniques
科技信息资源内容监测与分析服务平台 (中国科技部信息情报研究所)
65
Arnetminer
PatentMiner CheMiner PubmedMiner ScopusMiner
66
Representative Publications • Jie Tang, Jing Zhang, Ruoming Jin, Zi Yang, Keke Cai, Li Zhang, and Zhong Su. Topic Level Expertise
Search over Heterogeneous Networks. Machine Learning Journal.
• Jie Tang, Limin Yao, Duo Zhang, and Jing Zhang. A Combination Approach to Web User Profiling. ACM
TKDD, 2010.
• Juanzi Li, Jie Tang, Yi Li, Qiong Luo. RiMOM: A Dynamic Multi-Strategy Ontology Alignment Framework.
IEEE TKDE, 2009.
• Chenhao Tan, Jie Tang, Jimeng Sun, Quan Lin, and Fengjiao Wang. Social Action Tracking via Noise
Tolerant Time-varying Factor Graphs. KDD’10.
• Chi Wang, Jiawei Han, Yuntao Jia, Duo Zhang, Yintao Yu, Jie Tang, Jingyi Guo. Mining Advisor-Advisee
Relationships from Research Publication Networks. KDD’10.
• Jie Tang, Jimeng Sun, Chi Wang, and Zi Yang. Social Influence Analysis in Large-scale Networks. KDD'09.
• Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, and Zhong Su. ArnetMiner: Extraction and Mining of
Academic Social Networks. KDD’08.
• Jie Tang, Hang Li, Yunbo Cao, and Zhaohui Tang. Email Data Cleaning. KDD’05.
• Jie Tang, Ho-fung Leung, Qiong Luo, Dewei Chen, and Jibin Gong. Towards Ontology Learning from
Folksonomies. IJCAI’09.
• Qian Zhong, Hanyu Li, Juanzi Li, Guotong Xie, Jie Tang, Lizhu Zhou. A Gauss Function based Approach for
Unbalanced Ontology Matching. SIGMOD’09.
• Feng Shi, Juanzi Li, Jie Tang. Actively Learning Ontology Matching via User Interaction. ISWC’09.
• Chonghui Zhu, Jie Tang, Hang Li, Hwee Tou Ng, and Tiejun Zhao. A Unified Tagging Approach to Text
Normalization. ACL’07.
Others: ICDM’07-09, CIKM’07-09, SDM’09, ISWC’06, DKE, JIS, etc.
67
Demo: http://arnetminer.org
HP: http://keg.cs.tsinghua.edu.cn/persons/tj/
Thanks!
top related