large-scale networks - redbelly blockchain · large-scale networks 1a-introduction school of...
TRANSCRIPT
The University of Sydney Page 1
Large-Scale Networks
1a-Introduction
School of Information Technologies Dr Vincent Gramoli | Lecturer
What is a Network?
2
The University of Sydney Page 3
Why you should care about networks
– Universal language for describing complex data – Networks from science, nature, and technology are more similar than one
would expect
– Shared vocabulary between fields – Computer science, social science, physics, economics, statistics, biology
– Data availability (/computational challenges) – Web/mobile, bio, health, and medical
3
The University of Sydney Page 4
Air-traffic
4
US air-traffic graph
The University of Sydney Page 5
5
Facebook social graph - 4-degrees of separation [Backstrom-Boldi-Rosa-Ugander-Vigna, 2011]
The University of Sydney Page 6
Epidemics
6
Spread of tuberculosis disease
The University of Sydney Page 7
Terrorist network
7
9/11 terrorist network [Krebs, 2002]
The University of Sydney Page 8
Blog Web pages
8
Political blogs prior to the 2004 U.S. Presidential election
Administration
9
The University of Sydney Page 10
Timetable
– Title: Large-Scale Network – UOS code: COMP5313 – Credit points: 6 – Lecture:
– Thursdays 18h-20h, weeks 1-13
– Lab: – Thursdays 20h-21h, weeks 2-10 (11-12)
The University of Sydney Page 11
Myself
– Vincent Gramoli – Office: 417 – Phone: (02) 903 69270 – [email protected] – http://sydney.edu.au/engineering/it/~gramoli/ – Office hour: Thursday 14h-15h
– Background – Distributed computing – USA, France, Switzerland
The University of Sydney Page 12
eLearning
– Course information:
http://poseidon.it.usyd.edu.au/~gramoli/largescalenetworks/php/comp5313.php
– All assessments are communicated through the blackboard website: https://elearning.sydney.edu.au/ and piazza.
– Do not send email to the lecturer, but post questions online on piazza https://piazza.com/sydney.edu.au/
– Send proof (certificate) within the next 5 days (special consideration)
and you will either be excused or you will be evaluated through an alternative mark assessment
The University of Sydney Page 13
eLearning
– Slides will be uploaded right before the lecture
– Video recording should be automatically posted on blackboard
Material
14
The University of Sydney Page 15
Materials
– Required textbook – Freely available online at
http://www.cs.cornell.edu/home/kleinber/networks-book/
15
Textbook: Networks, Crowds and Markets. Easley and Kleinberg
The University of Sydney Page 16
Materials
16
Related books (optional)
• Albert-Laszlo Barabasi. Linked: The New Science of Networks. • Duncan Watts. Six Degrees: The Science of a Connected Age. 2003. • Mark Buchanan. Nexus: Small Worlds and the Groundbreaking Theory of Networks. 2003. • Laszlo Barabasi. Network Science. 2015
The University of Sydney Page 17
Materials
17
Other books related to defence (optional)
…and to programming (optional)
Roadmap
18
The University of Sydney Page 19
Roadmap
Click here: http://poseidon.it.usyd.edu.au/~gramoli/largescalenetworks/php/comp5313.php
Objectives
20
The University of Sydney Page 21
Course description
– The growing connectedness of modern society translates into simplifying global communication and accelerating spread of news, information and epidemics. The focus of this unit is on the key concepts to address the challenges induced by the recent scale shift of complex networks. In particular, the course will present how scalable solutions exploiting graph theory, sociology, game theory and probability tackle the problems of communicating (routing, diffusing, aggregating) in dynamic and social networks.
The University of Sydney Page 22
Goal
– Understanding – Network structures – Strategic behaviors – Epidemic behaviors
– Networks were studied in many fields – Computer science – Applied mathematics – Sociology
– A synthesis is thus necessary – Complexity of network structures – Information – System with interacting agents
› We cover:
- Graph theory
- Information networks
- Network dynamics
- Basic Python programming
› We do not cover:
- Game theory
- Economics
- Markets
The University of Sydney Page 23
Learning outcomes
– Information Seeking – 1. Students understand and can quantify accurately the role of network in communication
exchanges. – 2. Students understand the technical issues that affect the dissemination of information in a
network. – 3. Students can analyse probabilistically the relations between communicating entities of a
network. – 4. Students know the key factors that impact the accuracy and speed of information
dissemination and aggregation. – Maths/Science Methods and Tools
– 5. Students understand the asymptotic complexity and accuracy of graph algorithms. – 6. Students know the stochastic methods necessary to evaluate the convergence of various
algorithms. – 7. Students recognise probabilistic solutions to problems that have no deterministic solutions
and apply them thoroughly. – Engineering/IT Specialisation
– 8. Students have skills to compare experimentally and theoretically the adequacy of different probabilistic solutions.
– 9. Students are familiar with various types of network models in different contexts like computer science, society or markets
– 10. Students understand the fundamental structures, dynamics and resource distribution in such models.
The University of Sydney Page 24
Assessments
– Assignment 1 – Solving problems [20%; due week 6]. – Covers outcomes 2, 3, 4, 5, 6 and 7.
– Midterm online Quiz
– Answering multiple choice questions in labs [20%; due week 7]. – Covers outcomes 1, 2, 3, 4, 5, 7 and 10.
– Assignment 2:
– Either one of these two tasks: (1) Writing a short (4-6p) research paper exploring a research topic related to the course in LaTeX and presenting the related work and an analysis of this topic; or (2) Programming an algorithm related to the in C/C++, Java or Python and making a demo of it.
– [20%; due week 10]. Covers outcomes 5, 7, 8, 9 and 10.
– Final exam [40%; due exam period] Covers all outcomes.
The University of Sydney Page 25
Assessments
– Suggestions for assignment 2
– Programming assignment examples (suggest your own as early as possible): – Computing the PageRank on a network, analyzing the result, concluding – Extracting properties from a network (clustering coefficient, degree, diameter), analyzing
the result, concluding – Building a new network, based on Massively Open Online Courses (MOOC) and linking
them with pre-requisites links, analyzing, concluding
– Literature assignment examples (suggest your own by as early as possible): – Gossip-Based Computation of Aggregate Information. Kempe, Dobra, Gherke. – Google's PageRank and Beyond: The Science of Search Engine Rankings. Amy N. Langville
& Carl D. Meyer. ISBN:9780691152660. 2012. – Eugene Garfield. Citation analysis as a tool in journal evaluation. Science, 178:471-479,
1972. – Attitudes and Cognitive Organizations. F. Heider 1946. The Journal of Psychology
21:107-112, 1946 – Palmer, Gibbons, Faloutos. ANF: A fast and scalable tool for data mining in massive
graphs. Proc. Of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 81-90, 2002.
– David R. Karger, Matthias Ruhl. Finding nearest neighbors in growth-restricted metrics. STOC 2002: 741-750
Topics
26
The University of Sydney Page 27
Graph theory
– The communication can be balanced between staying within small organizational units and cutting across organizational boundaries
– Strong ties, representing close and frequent social contacts, tend to be embedded in tightly-linked regions of the network
– Weak ties, representing more casual and distinct social contacts, tend to cross between these regions
– At a global scale, it suggests some of the ways in which weak ties can act as short-cuts that link together distant parts of the world, resulting in the phenomenon colloquially known as the six degrees of separation.
27
The University of Sydney Page 28
Graph theory
28
E-mail communication among 436 employees of HP Lab
The University of Sydney Page 29
Structural balance
– Social networks can also capture the sources of conflict within a group
– People may tend to be connected to many others without being connected together
– Non-interacting clusters may be symptomatic of a conflicts between people
– Structural balance can be used to reason about how fissures in a network may arise from the dynamics of conflict and antagonism at a local level
29
The University of Sydney Page 30
Structural balance
30
Two rival karate clubs represented by different colors
Instructor Student founder
The University of Sydney Page 31
Network cascades
– A cascading behavior spreads from one person to another – Like a biological epidemics
– Called “social contagion”
– Cascading effects may arise from individuals with incentive to adopt the behavior of their neighbors in the network – A new behavior starts with a small set of initial adopters
– Then spreads radially outward through the network
31
The University of Sydney Page 32
Network cascades
32
Ebola 2014 Crisis
The University of Sydney Page 33
Network cascades
33
Ebola 2014 Crisis
Source: Washington Post
The University of Sydney Page 34
Network cascades
34
E-mail recommendations for a particular Japanese graphic novel spread outward from four initial purchasers [Leskovec et al. The dynamics of viral marketing. ACM Trans. on the Web ‘07]
The University of Sydney Page 35
Applications
– When understood, the properties of networks can be applied elsewhere
– News spreading: Six degrees of separation is useful to disseminate information rapidly
– File lookup: Selecting shortcuts adequately can help navigating rapidly to a destination
35
The University of Sydney Page 36
Applications
36
A gossip-based protocols allows to rapidly spread data to a large number of computers of the same network
1.
2.
3.
4.
time
The University of Sydney Page 37
Applications
37
The overlay networks use shortcuts to effectively retrieve files in a peer-to-peer file sharing system
Why Taking this Course?
38
The University of Sydney Page 39
Why taking this course?
– Networks are everywhere – Public transportation – Social media – Protein interactions
– Give a better understanding of networks – Property identification (low diameter, high clustering coefficient…) – Characterization (scale-free, power-law distribution…)
– Large impact – Marketing: what individuals to target to maximize product adoption? – Defense: how to locate a terrorist? – Health: how to identify patient 0 and understand epidemics
propagation?
The University of Sydney Page 40
Why taking this course?
– Age and size of social networks
40
Source: CS224W Stanford University - Jure Leskovec.