CS5234 Combinatorial and Graph Algorithms
Welcome!
CS5234 Overview
• Webpage: http://www.comp.nus.edu.sg/~gilbert/CS5234
• Instructor: Seth Gilbert
  Office: COM2-323
  Office hours: by appointment
Why are we here?
Combinatorial and Graph Algorithms:
Why are we here?
Combinatorial and BIG Graph Algorithms:
A modern twist on classic problems…
Why are we here?
What happens when we have a graph containing 4 billion nodes and 1.2 trillion edges?
(The facebook graph, for example, is at least that big.)
Motivating question
Assume a graph of size 1TB.
• Disk scan: 200 MB/s → 83 minutes
• Disk seek: 1 MB/s → 11.5 days
Cost of simple Breadth-First-Search?
(Organize your data wisely.)
Some numbers
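The two figures above follow from dividing the graph size by the transfer rate. A quick check, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the constant and function names are illustrative:

```python
# Back-of-the-envelope check of the disk numbers (decimal units assumed).
GRAPH_BYTES = 10**12          # 1 TB
SCAN_RATE = 200 * 10**6       # sequential scan: 200 MB/s
SEEK_RATE = 1 * 10**6         # seek-dominated access: 1 MB/s effective


def transfer_seconds(num_bytes: int, bytes_per_second: int) -> float:
    """Time to move num_bytes at a constant rate."""
    return num_bytes / bytes_per_second


scan_minutes = transfer_seconds(GRAPH_BYTES, SCAN_RATE) / 60
seek_days = transfer_seconds(GRAPH_BYTES, SEEK_RATE) / 86_400

print(f"scan: {scan_minutes:.1f} minutes")   # about 83 minutes
print(f"seek: {seek_days:.1f} days")         # about 11.6 days
```

A BFS that touches edges in an order unrelated to their layout on disk pays something closer to the second figure than the first, which is why data organization matters.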
Scale
• How do we deal with graphs that are big?
• Cannot store entire graph in memory.
• Processing time is large!
New Challenges
Where is the data?
• Data is no longer as easily accessible.
• Is data distributed?
• Is data streaming?
• Is data noisy?
New Challenges
Dynamic world
• Data is no longer static.
• Graphs change over time.
• Edges may be added and removed.
• Users may come and go.
New Challenges
Context matters
• Where did the data come from?
• Is it from a social network?
• Is it from a wireless network?
• Is it from a game?
• How can we leverage the structure to do better?
New Challenges
Algorithms 101
• Kruskal’s Algorithm
• Prim’s Algorithm
• Runs in O(m log n) time for n nodes and m edges.
• Fast enough?
Example: Minimum Spanning Tree
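As a reminder of where the O(m log n) bound comes from, here is a minimal sketch of Kruskal's algorithm with a union-find structure. The `(weight, u, v)` edge layout and the function name are assumptions made for this example; nodes are taken to be numbered 0..n-1.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (weight, u, v); this layout is an assumption


def kruskal(n: int, edges: List[Edge]) -> Tuple[int, List[Edge]]:
    """Return (total weight, chosen edges) of a minimum spanning forest."""
    parent = list(range(n))

    def find(x: int) -> int:
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):        # sorting is the O(m log n) step
        ru, rv = find(u), find(v)
        if ru != rv:                     # edge joins two components: keep it
            parent[ru] = rv
            total += w
            chosen.append((w, u, v))
    return total, chosen


example = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
weight, tree = kruskal(4, example)
print(weight, tree)   # 6 [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```

Sorting all m edges dominates the running time, and it requires the whole edge list to be available at once, which is exactly what becomes a problem at the scale discussed above.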
Special Structure
• Is the graph planar?
• Then we can find an MST in O(m) time.
• Is the graph a social network?
Example: Minimum Spanning Tree
Randomization and Approximation
• Can we find a faster randomized algorithm?
• Approximate MST?
• Estimate weight of MST?
Amazingly: the MST weight can be estimated in O(dW log(dW)) time for a graph with degree d and max. edge weight W (for a fixed approximation error).
No dependence on n!!
Example: Minimum Spanning Tree
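One known route to bounds of this kind (due to Chazelle, Rubinfeld, and Trevisan) rests on a counting identity: for a connected graph with integer edge weights in {1, ..., W}, the MST weight equals n - W + c_1 + ... + c_{W-1}, where c_i is the number of connected components when only edges of weight at most i are kept. The sketch below only checks that identity exactly on a small graph. It does not do the random sampling that estimates each c_i and removes the dependence on n, and the function names are illustrative.

```python
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (u, v, weight), weight an integer in 1..W


def count_components(n: int, edges: List[Edge], max_weight: int) -> int:
    """Components of the subgraph that keeps only edges of weight <= max_weight."""
    parent = list(range(n))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for u, v, w in edges:
        if w <= max_weight:
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
    return components


def mst_weight_by_counting(n: int, edges: List[Edge], W: int) -> int:
    """MST weight of a connected graph via n - W + sum of component counts."""
    return n - W + sum(count_components(n, edges, i) for i in range(1, W))


edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 3), (1, 3, 1)]
print(mst_weight_by_counting(4, edges, W=3))   # 4, from MST edge weights 1 + 1 + 2
```

The identity turns "find the MST weight" into "count components W - 1 times", and component counts are the kind of quantity that can be estimated by exploring small neighborhoods of a few random nodes.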
Streaming
• What if we only have limited access to data?
• We get to read each edge once in some arbitrary order: e1, e2, e3, …, em
• We can’t store the whole graph!
• Output an (approximate) MST?
Example: Minimum Spanning Tree
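To make the streaming setting concrete, here is a minimal one-pass sketch. It assumes memory proportional to the number of nodes n is affordable even though the m edges are not (often called the semi-streaming setting). It keeps a spanning forest and, whenever a new edge closes a cycle, evicts the heaviest edge on that cycle. The function name and the `(u, v, w)` edge layout are assumptions for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def streaming_mst(n: int, stream: Iterable[Tuple[int, int, int]]) -> List[Tuple[int, int, int]]:
    """One pass over (u, v, w) edges, keeping at most n - 1 edges in memory."""
    adj: Dict[int, Dict[int, int]] = {v: {} for v in range(n)}  # current forest

    def path(src: int, dst: int) -> Optional[List[Tuple[int, int]]]:
        # Iterative DFS in the forest; returns the edges on the src-dst path.
        prev = {src: None}
        stack = [src]
        while stack:
            x = stack.pop()
            if x == dst:
                break
            for y in adj[x]:
                if y not in prev:
                    prev[y] = x
                    stack.append(y)
        if dst not in prev:
            return None
        found = []
        while prev[dst] is not None:
            found.append((prev[dst], dst))
            dst = prev[dst]
        return found

    for u, v, w in stream:
        if u == v:
            continue                     # self-loops never belong to an MST
        cycle = path(u, v)
        if cycle is not None:
            a, b = max(cycle, key=lambda e: adj[e[0]][e[1]])
            if adj[a][b] <= w:
                continue                 # new edge is heaviest on its cycle
            del adj[a][b], adj[b][a]     # evict the heaviest forest edge
        adj[u][v] = adj[v][u] = w

    return [(a, b, w) for a in adj for b, w in adj[a].items() if a < b]


result = streaming_mst(4, [(0, 1, 4), (1, 2, 5), (0, 2, 1), (2, 3, 2), (1, 3, 3)])
print(sorted(result))   # [(0, 2, 1), (1, 3, 3), (2, 3, 2)]
```

This version spends O(n) time per edge on the path search, so it illustrates the memory budget rather than a fast implementation. With less than O(n) memory, an exact tree cannot even be written down, which is where approximation enters.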
Dynamic
• What if edges change over time?
• Edges are continually added and removed from our graph.
• After each change, find a new MST.
Example: Minimum Spanning Tree
Caching
• Caching performance is critical.
• Each time we access part of the graph, a block of memory is loaded.
  – Expensive!
• How can we design an algorithm for finding an MST that uses cache efficiently?
Example: Minimum Spanning Tree
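A toy model shows why access order matters. It assumes a cache that holds exactly one block at a time, and the array length `N` and block size `B` are hypothetical; real caches hold many blocks, so this is only a sketch of the effect.

```python
import random


def block_loads(accesses, block_size: int) -> int:
    """Count block loads for a cache that holds a single block (toy model)."""
    loads, cached = 0, None
    for index in accesses:
        block = index // block_size
        if block != cached:      # miss: fetch the block containing this index
            loads += 1
            cached = block
    return loads


N, B = 100_000, 1_000            # hypothetical array length and block size
rng = random.Random(0)
shuffled = list(range(N))
rng.shuffle(shuffled)

print(block_loads(range(N), B))  # sequential scan: N / B = 100 loads
print(block_loads(shuffled, B))  # random order: close to one load per access
```

Graph traversals tend to behave like the shuffled case unless the algorithm and the data layout are designed together.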
Parallel/GPU/Distributed
• Can we leverage a multicore machine to find an MST faster?
• Can we use GPUs to get faster performance?
• Can we use a distributed cluster (e.g., MapReduce/Hadoop) to find an MST faster?
Example: Minimum Spanning Tree
Explore a set of tools for answering these questions.
Goal
“If you need your software to run twice as fast, hire better programmers.
But if you need your software to run more than twice as fast, use a better algorithm.”
-- Software Lead at Microsoft
Explore a set of useful tools for answering these questions.
See a bunch of neat algorithms.
Goal
“... pleasure has probably been the main goal all along.
But I hesitate to admit it, because computer scientists want to maintain their image as hard-working individuals who deserve high salaries... ”
-- D. E. Knuth
Combinatorial and (BIG) Graph Algorithms
Target students:
– Advanced (3rd or 4th year) undergraduates
– Master’s students
– PhD students
– Interested in algorithms
– Interested in tools for solving hard problems
Prerequisites:
– CS3230 (Analysis of Algorithms)
– Mathematical fundamentals
CS5234 : Combinatorial and Graph Algorithms
This is a class about algorithms.
This is a class about algorithms.
expected value P=NP
This is a class about algorithms.
The goal is to deeply understand the algorithms we are studying.
How do they work?
Why do they work?
What are the underlying techniques?
What are the trade-offs?
How do you implement them?
• Mid-term exam: October 12, in class, Week 8
• Final exam: December 5 (please check the official schedule)
CS5234 Overview
• Lecture: Thursday 6:30-8:30pm
• Extra time: Thursday 8:30-9:30pm
Extra time will be used for discussion, reviewing problem sets, answering questions, solving riddles, doing crossword puzzles, eating cookies, etc.
CS5234 Overview
• Grading
40% Problem sets
25% Mid-term exam
35% Final exam
• Problem sets
– 5-6 sets (roughly every week)
– Focused on algorithm design and analysis.
– Perhaps a few will require coding.
CS5234 Overview
• Mini-Project
Small project
Idea: put together some of the different ideas we have used in the class.
Time scale: last 4 weeks of the semester.
CS5234 Overview
Survey: Google form.
On the web page.
What is your background?
Not more than 10 minutes.
PS1: Released tomorrow.
CS5234 Overview
• Problem set grading
Simple scheme:
3 : excellent, perfect answer
2 : satisfactory, mostly right
1 : many mistakes / poorly written
0 : mostly wrong / not handed in
-1 : utter nonsense
CS5234 Overview
• What to submit:
Concise and precise answers:
Solutions should be rigorous, containing all necessary detail, but no more.
Algorithm descriptions consist of:
1. Summary of results/claims.
2. Description of algorithm in English.
3. Pseudocode, if helpful.
4. Worked example of algorithm.
5. Diagram / picture.
6. Proof of correctness and performance analysis.
CS5234 Overview
• How to draw pictures?
By hand:
Either submit hardcopy, or scan, or take a picture with your phone!
Or use a tablet / iPad…
Digitally:
1. xfig (ugh)
2. OmniGraffle (mac)
3. Powerpoint (hmmm)
4. ???
CS5234 Overview
• Policy on plagiarism:
Do your work yourself:
Your submission should be unique, unlike anything else submitted, on the web, etc.
Discuss with other students:
1. Discuss general approach and techniques.
2. Do not take notes.
3. Spend 30 minutes on facebook (or equiv.).
4. Write up solution on your own.
5. List all collaborators.
Do not search for solutions on the web:
Use the web to learn techniques and to review material from class.
CS5234 Overview
• Policy on plagiarism:
Penalized severely:
First offense: minimum of one letter grade lost on final grade for class (or referral to SoC disciplinary committee).
Second offense: F for the class and/or referral to SoC.
Do not copy/compare solutions!
CS5234 Overview
Introduction to Algorithms
– Cormen, Leiserson, Rivest, Stein
Algorithms Review
Algorithm Design
– Kleinberg and Tardos
Algorithms Review
• Sampling and Sketching Very Big Graphs
• Efficient Algorithms for Modern Machines
A modern twist on classic problems…
BFS, DFS, MST, Shortest Path, etc.
Topics (tentative)
• Sampling and Sketching Very Big Graphs
Part 1: Graph properties in less than linear time
Connectivity
Connected components
Minimum spanning tree
Average degree
Approximate diameter
Matching
Topics (tentative)
• Sampling and Sketching Very Big Graphs
Part 2: Sketches and streams
Sampling from a stream
L0-samplers
Graph sketches
Connectivity
Minimum spanning trees
Triangle counting
Topics (tentative)
• Efficient Algorithms for Modern Machines
Part 3: Caching
Cache-efficient algorithms
BFS
Priority queues
Shortest path
Minimum spanning trees
Topics (tentative)
• Efficient Algorithms for Modern Machines
Part 4: Parallel Algorithms
Fork-join parallelism
Map-Reduce
BFS / DFS
Shortest path
Topics (tentative)
CS5234 Combinatorial and Graph Algorithms
Welcome!