mining and supporting community structures in sensor network research

Post on 29-Aug-2014

  • Mining and supporting community structures in sensor network research.
    • Alberto Pepe (University of California at Los Angeles)
    • Marko A. Rodriguez (Los Alamos National Laboratory)
    • CENS Friday Seminar | May 2, 2008
  • Outline.
    • Studying Collaboration at CENS
      • Introduction to Data Practices
      • Detection of Structural Communities
      • Data Set and Methods
      • Results
    • Supporting Collaboration at CENS
      • Introduction to the Semantic Web
      • Semantic Networks and Graph Databases
      • Analyzing Semantic Networks
      • Demo
    Presenters: Alberto, Marko
  • Data practices group.
    • Background research questions:
      • What are CENS data?
      • What context data is necessary to support interpretation during re-use?
      • How can we automate the capture of context data?
      • How can we link scholarly and scientific data into meaningful aggregations/chains?
      • What are the social and academic settings that yield the production of scientific and engineering data/knowledge?
  • Current study.
    • Question: how do collaboration communities differ from socioacademic communities?
    • Method: comparative analysis of the community structure of the coauthorship network against selected socioacademic community structures (e.g. academic department, affiliation, country of origin, academic position)
    Rodriguez, M.A., Pepe, A., On the relationship between the structural and socioacademic communities of a coauthorship network, Journal of Informetrics, in press, 2008.
  • Steps of the study.
    • Gather bibliographic and socioacademic data.
    • Generate coauthorship network.
    • Determine structural communities in the coauthorship network.
    • Test for statistical independence between the structural and socioacademic communities.
  • Gather data.
    • Population data:
      • Collected from eScholarship repository
      • 291 CENS and non-CENS authors
      • Multi-institutional and interdisciplinary
      • 560 manuscripts (379 conference papers, 163 journal articles)
      • Published over a ten-year period (1998-2007)
      • Gathered academic department, academic affiliation, country of origin, and academic position
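  • Generate coauthorship network.
    • Each bibliographic record lists the authors of one manuscript, e.g. in BibTeX:
      @article{
        author={Marko A. Rodriguez and Alberto Pepe},
        title={On the relationship},
        journal={Journal of Informetrics},
        year=2008,
        editor={Leo Egghe},
      }
    • Authors listed on the same record are linked by a coauthor edge (figure: Alberto, coauthor, Marko).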
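The step from bibliographic records to a coauthorship network can be sketched as follows. This is a minimal illustration rather than the authors' actual pipeline: the function name `build_coauthorship_network`, the dictionary layout of a record, and the choice to weight each edge by the number of shared manuscripts are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship_network(records):
    """Return {frozenset({a, b}): number of manuscripts a and b share}.

    `records` is a list of dicts with a BibTeX-style "author" field
    (hypothetical layout, chosen for this sketch).
    """
    edge_weights = defaultdict(int)
    for record in records:
        # BibTeX separates author names with the word "and".
        authors = sorted(
            {name.strip() for name in record["author"].split(" and ") if name.strip()}
        )
        # Every pair of authors on the same manuscript gets one coauthor edge.
        for a, b in combinations(authors, 2):
            edge_weights[frozenset((a, b))] += 1
    return dict(edge_weights)


records = [
    {"author": "Marko A. Rodriguez and Alberto Pepe"},
    {"author": "Alberto Pepe"},  # hypothetical single-author record: adds no edge
]
network = build_coauthorship_network(records)
```

Using a `frozenset` as the edge key keeps the network undirected, so the pair (Alberto, Marko) and the pair (Marko, Alberto) map to the same edge.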
  • CENS population statistics. Socioacademic communities
  • Study model.
    • Alberto: Affiliation: UCLA; Department: IS; Origin: Italy; Position: PhD Student
    • Marko: Affiliation: LANL; Department: CS; Origin: USA; Position: PostDoc
    • Edge: Alberto, coauthor, Marko
  • Structural communities.
    • Structural communities are cliquish subgraphs: groups of vertices that are densely connected to one another but sparsely connected to the rest of the network.
    Girvan, M., & Newman, M. E. J., Community structure in social and biological networks. Proceedings of the National Academy of Sciences, 99, 7821, 2002.
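One common way to quantify "densely connected inside, sparsely connected outside" is Newman's modularity score Q, which compares the fraction of edges inside each community with the fraction expected if edges were placed at random. The slides do not say that this score was reported, so the sketch below is only an illustration; the function name `modularity` and the toy graph are assumptions.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every vertex to a community label.
    """
    m = len(edges)
    internal = {}  # number of edges with both endpoints in the community
    degree = {}    # sum of vertex degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    # Q = sum over communities of (internal edge fraction - expected fraction)
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items()
    )


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
community_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
q = modularity(edges, community_of)
```

For the toy graph, each triangle holds 3 of the 7 edges and a degree sum of 7, so Q = 2 × (3/7 − (7/14)²) = 5/14, about 0.357. Putting all six vertices in one community gives Q = 0.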
  • Community detection methods.
    • edge betweenness [1]
    • walktrap (random walks) [2]
    • spinglass [3]
    • leading eigenvector [4]
    [1] Girvan, M., & Newman, M. E. J., Community structure in social and biological networks, Proceedings of the National Academy of Sciences, 99:7821, 2002.
    [2] Pons, P., & Latapy, M., Computing communities in large networks using random walks, Journal of Graph Algorithms and Applications, 10:2, 2006.
    [3] Reichardt, J., & Bornholdt, S., Statistical mechanics of community detection, Physical Review E, 74 (016110), 2006.
    [4] Newman, M. E. J., Finding community structure in networks using the eigenvectors of matrices, Physical Review E, 74, 2006.
  • Coauthorship network map. 27 structural communities detected in the CENS network by the leading eigenvector (LEV) method.
  • Coauthorship network statistics.
    • Typical clustering coefficients:
      • mathematics: 0.34
      • physics: 0.56
      • biology: 0.60
    • Compared with these fields, CENS shows less cliquish, sparser collaboration patterns
    • The CENS community is fragmented in its research agenda
    Newman, M. E. J., The structure and function of complex networks, SIAM Review, 45, 167, 2003.
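A clustering coefficient of the kind quoted above can be computed as in the sketch below, which averages the local clustering coefficient over all vertices. The slides do not state which definition was used (the average of local coefficients or the triangle-based transitivity ratio), so treat the choice here as an assumption; the function name and the toy graph are also illustrative.

```python
from itertools import combinations


def average_clustering(adjacency):
    """Mean local clustering coefficient of an undirected graph.

    adjacency: dict mapping each vertex to the set of its neighbours.
    Vertices with fewer than two neighbours contribute 0.
    """
    total = 0.0
    for node, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            continue
        # Count neighbour pairs that are themselves connected.
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)


# Toy graph (hypothetical): a triangle a-b-c with a pendant vertex d attached to c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
c_avg = average_clustering(adjacency)
```

Here a and b each score 1, c scores 1/3 (one connected pair out of three), and d scores 0, giving an average of 7/12. A lower value means fewer closed triangles among collaborators, which is what "less cliquish" refers to.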
  • Chi square test.
    • The chi-square test determines whether two nominal (categorical) properties are statistically independent.
    • Example (figure): two coauthors, each annotated with a structural community and socioacademic properties:
      • Alberto: Community: A; Affiliation: UCLA; Department: IS; Origin: Italy; Position: PhD Student
      • Marko: Community: B; Affiliation: LANL; Department: CS; Origin: USA; Position: PostDoc
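The test can be sketched in a few lines: cross-tabulate the two categorical labels of every author, compare observed counts with the counts expected under independence, and sum the normalized squared differences. This is a minimal illustration under stated assumptions, not the authors' code; the function name `chi_square_independence` and the four-author sample are made up for the example.

```python
from collections import Counter


def chi_square_independence(xs, ys):
    """Pearson chi-square statistic and degrees of freedom for two
    categorical variables observed on the same individuals."""
    n = len(xs)
    observed = Counter(zip(xs, ys))  # contingency table cells
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    statistic = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            # Count expected in this cell if the two labels were independent.
            expected = row_total * col_total / n
            statistic += (observed[(x, y)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical sample: structural community vs. department for four authors.
communities = ["A", "A", "B", "B"]
departments = ["IS", "IS", "CS", "CS"]
statistic, dof = chi_square_independence(communities, departments)
```

In this sample every expected cell count is 1, so the statistic is 4.0 with 1 degree of freedom. The p-value is the upper tail of the chi-square distribution at that statistic; if SciPy is available, `scipy.stats.chi2.sf(statistic, dof)` returns it, and `scipy.stats.chi2_contingency` performs the whole test on a contingency table.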
  • Chi square analysis. N.B. a p-value greater than 0.05 means that independence cannot be rejected, so the two properties are considered statistically independent. Methods compared: leading eigenvector (LEV), walktrap (WT), edge betweenness (EB), spinglass (SG).
  • Anecdotal example.
  • Remarks.
    • Findings:
      • Community structure is representative of department and affiliation
      • Academic position and country of origin are independent of the structural community of the scholar.
    • Generalization:
      • Policy recommendations to increase interdisciplinarity
      • Extension to other coauthorship networks and other socioacademic (demographic) variables
      • Useful to predict or infer topological/socioacademic configuration when data is scarce
  • Metadata reuse.
    • Metadata can be used to support scholarly collaboration.
  • Everything is metadata. (Figure: a semantic network with nodes Borgman, Pepe, Article1, Article2, JCDL, Italy, UCLA, CENS, and Sensor Networks, connected by edges labeled writtenBy, member, country, attended, hasLab, cites, topic, researches, and contains.)
  • Introduction to the Semantic Web.
    • The World Wide Web links documents, where each document is given a universal identifier/locator called a URI (e.g. a URL).
      • The structure is machine processable, but the documents/elements are primarily human processable.
    • The Semantic Web links data, where each data item is given a universal identifier/locator called a URI (e.g. a URL).
      • The structure and the data are both human and machine processable.
    T. Berners-Lee, J. Hendler. Publishing on the Semantic Web. Nature, 410(6832):1023-1024, April 2001.
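Linked data of this kind is often represented as subject, predicate, object triples in which every element is a URI. The sketch below reuses labels from the "Everything is metadata" slide, but the `http://example.org/` namespace, the helper names `uri` and `match`, and the specific pairing of nodes with edges are assumptions, since the transcript does not preserve how the figure connected them.

```python
BASE = "http://example.org/"  # hypothetical namespace, not from the slides


def uri(name):
    """Build an identifier for a resource or a relation."""
    return BASE + name


# Each statement is a (subject, predicate, object) triple of URIs.
triples = {
    (uri("Article1"), uri("writtenBy"), uri("Pepe")),  # assumed pairing
    (uri("Pepe"), uri("member"), uri("CENS")),         # assumed pairing
    (uri("Pepe"), uri("country"), uri("Italy")),       # assumed pairing
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


about_pepe = match(triples, s=uri("Pepe"))
authored = match(triples, p=uri("writtenBy"), o=uri("Pepe"))
```

Because both the things and the relations between them carry identifiers, a program can answer questions such as "which articles were written by Pepe?" by pattern matching alone, without interpreting any document text.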
  • The Uniform Resource Identifier.
    • Resource = Anything.
