COM1721: Freshman Honors Seminar
A Random Walk Through Computing
Rajmohan Rajaraman
Tuesdays, 5:20 PM, 149 CN
Introduction
  Explore a potpourri of concepts in computing
    potpourri, 1: a mixture of flowers, herbs, and spices that is usually kept in a jar and used for scent; 2: a miscellaneous collection. Etymology: French pot pourri, literally rotten pot
  Theory, examples, and applications
  Readings: Handouts and WWW
  Grading: Quizzes, homework, and class participation
Sample Concepts
  Abstraction
  Modularity
  Randomization
  Recursion
  Representation
  Self-reference
  …
Sample Topics
  Dictionary search
  Structure of the Web
  Self-reproducing programs
  Undecidability
  Private communication
  Relational databases
  Quantum computing, bioinformatics,…
Abstraction
  A view of a problem that extracts the essential information relevant to a particular purpose and ignores inessential details
  Driving a car:
    We are provided a particular abstraction of the car in which we only need to know certain controls
  Building a house:
    Different levels of abstraction for house owner, architect, construction manager, real estate agent
  Related concepts: information hiding, encapsulation, representation
Modularity
  Decomposition of a system into components, each of which can be implemented independently of the others
  Foundation for good software engineering
  Design of a basic processor from scratch
Representation
  To portray things or relationships between things
  Knowledge representation: model relationships among objects as an edge-labeled graph
  Data representation: bar graphs, histograms for statistics
  Querying a dictionary; Web as a graph
Randomization
  An algorithmic technique that uses probabilistic (rather than deterministic) selection
  A simple and powerful tool to provide efficient solutions for many complex problems
  Has a number of applications in security
    Cryptography and private communication
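One way to see probabilistic selection at work is randomized selection of the k-th smallest item, where the pivot is picked at random instead of by a fixed rule. This is only an illustrative sketch; the function name `quickselect` and the sample data are assumptions, not part of the slides.

```python
import random


def quickselect(values, k, rng=random):
    """Return the k-th smallest element of values (k = 0 is the minimum)."""
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    candidates = list(values)
    while True:
        pivot = rng.choice(candidates)  # the probabilistic selection step
        smaller = [v for v in candidates if v < pivot]
        equal = [v for v in candidates if v == pivot]
        larger = [v for v in candidates if v > pivot]
        if k < len(smaller):
            candidates = smaller                # answer lies among the smaller items
        elif k < len(smaller) + len(equal):
            return pivot                        # the pivot itself is the answer
        else:
            k -= len(smaller) + len(equal)      # skip past everything <= pivot
            candidates = larger


data = [31, 4, 15, 92, 65, 35, 89, 79]  # hypothetical sample data
print(quickselect(data, 3))              # fourth smallest value
```

The answer never depends on the random choices; only the running time does, and a random pivot makes consistently bad splits unlikely.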
Recursion
  A way of specifying a process by means of itself
  Complicated instances are defined in terms of simpler instances, which are given explicitly
  Closely tied to mathematical induction
  Fibonacci numbers
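The Fibonacci example can be sketched directly from the definition: two instances are given explicitly and every other instance is defined through simpler ones. The convention F(0) = 0, F(1) = 1 is an assumption here (some texts start at F(1) = F(2) = 1), and the cache is added only so repeated subproblems are not recomputed.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """n-th Fibonacci number, assuming F(0) = 0 and F(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n                         # simplest instances, given explicitly
    return fib(n - 1) + fib(n - 2)       # defined in terms of simpler instances


print([fib(i) for i in range(10)])  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```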
Self-reference
  A statement/program that refers to itself
  Examples:
    “This statement contains five words”
    “This statement contains six words”
    “This statement is not self-referential”
    “This statement is false”
  Important concept in computing theory
    Undecidability of the halting problem, self-reproducing programs
  Gödel, Escher, Bach: an Eternal Golden Braid, Douglas Hofstadter
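A self-reproducing program can be surprisingly short. The sketch below builds a two-line Python program from a template that contains a placeholder for its own text, runs it, and checks that what it prints is exactly its own source. The variable names are illustrative assumptions; this is one classic construction, not necessarily the one used in the course.

```python
import contextlib
import io

# The template describes the whole program, with %r standing for the
# template's own quoted text.
template = 'template = %r\nprint(template %% template)'
program = template % template  # source text of the self-reproducing program

# Run the program in a fresh namespace and capture what it prints.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(program, {})
output = buffer.getvalue()

print(program)
print(output == program + "\n")  # the program printed its own source
```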
Illustration: Representation
  Problem: Derive an expression for the sum of the first n natural numbers
  1 + 2 + 3 + … + n-2 + n-1 + n = ?
Sum of First n Natural Numbers
    1 +   2 +   3 + … +  98 +  99 + 100 = S
  100 +  99 +  98 + … +   3 +   2 +   1 = S
  101 + 101 + 101 + … + 101 + 101 = 2S
  S = 100*101/2
  S = n(n+1)/2
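The pairing argument can be checked mechanically: write the numbers forward and backward, add them column by column, and confirm that every column totals n+1. This is a small sketch; the function name `pairing_sum` is an assumption for illustration.

```python
def pairing_sum(n):
    """Sum 1..n using the forward/backward pairing from the derivation."""
    forward = list(range(1, n + 1))      # 1, 2, ..., n
    backward = forward[::-1]             # n, ..., 2, 1
    column_sums = [a + b for a, b in zip(forward, backward)]
    # Each of the n columns totals n + 1, so the two rows together give 2S.
    assert all(total == n + 1 for total in column_sums)
    return sum(column_sums) // 2


print(pairing_sum(100))  # 5050
```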
A Different Representation
  [Figure]
A “Geometric Derivation”
  [Figure]
  S = n(n+1)/2
Other Equalities
  Sum of first n odd numbers
    1 + 3 + 5 + … + (2n-1) = ?
  Sum of first n cubes
    1 + 8 + 27 + 64 + … + n^3 = ?
Representation and Programming
  Representation is the essence of programming
    Brooks, “The Mythical Man-Month”
  Data structures
Dictionary
  A collection of words with a specified ordering
    Dictionary of English words
    Dictionary of IP addresses
    Dictionary of NU student names
Searching a Dictionary
  Suppose we have a dictionary of 100,000 words
  Consider different operations
    Search for a word
    List all anagrams of a word
    Find the word matching the largest prefix
  What representation (data structure) should we choose?
Search for a Word
  Store the words in sorted order in a linear array
  Unsuccessful search: compare with 100,000 words
  Successful search: on average, compare with 50,000 words
Twenty Questions
  Compare with 50,000th word
    If match, then done
    If further in dictionary order, search right half
    If earlier in dictionary order, search left half
  Until word found, or search space empty
  Recursion
  Binary search
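The halving procedure above can be sketched as a recursive function that also counts how many questions (comparisons) it asks. The function name, the return shape, and the small word list are assumptions for illustration; the words themselves are borrowed from the example that follows in the slides.

```python
def binary_search(words, target, low=0, high=None, questions=0):
    """Search sorted `words` for `target`.

    Returns (index or None, number of comparisons asked).
    """
    if high is None:
        high = len(words)
    if low >= high:                      # search space empty
        return None, questions
    middle = (low + high) // 2
    questions += 1
    if words[middle] == target:          # match: done
        return middle, questions
    if words[middle] < target:           # target is further on: search right half
        return binary_search(words, target, middle + 1, high, questions)
    return binary_search(words, target, low, middle, questions)  # search left half


words = sorted(["ajuma", "alderaan", "alpheratz", "amber", "dali", "escher",
                "picasso", "reliable", "renoir", "vangogh", "yukon"])
print(binary_search(words, "renoir"))   # (8, 2)
print(binary_search(words, "monet"))    # not present: index is None
```

Each call discards at least half of the remaining words, which is why the number of questions grows so slowly with the size of the dictionary.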
How Many Questions?
  [Figure: example word list: ajuma, alderaan, alpheratz, amber, dali, escher, picasso, reliable, renoir, yukon, vangogh]
How Many Questions?
  Question #    Search space
  0             100,000
  1             50,000
  2             25,000
  3             12,500
  5             3,125
  10            100
  15            4
  17            1
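The table can be reproduced by repeatedly halving the search space. The sketch below assumes each question leaves at most half of the remaining words, rounded up; under that assumption the entry for question 10 comes out as 98, which the table rounds to 100.

```python
def search_space_sizes(n):
    """Worst-case number of candidate words left after each question."""
    sizes = [n]
    while sizes[-1] > 1:
        sizes.append((sizes[-1] + 1) // 2)   # keep the larger half, rounded up
    return sizes


sizes = search_space_sizes(100_000)
for question in (0, 1, 2, 3, 5, 10, 15, 17):
    print(question, sizes[question])
print("questions needed:", len(sizes) - 1)   # 17
```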
Anagrams
  An anagram of a word is another word with the same distribution of letters, placed in a different order
  Input: deposit
  Output: posited, topside, dopiest
  Anagrams: subessential, suitableness
Detecting Anagrams
  How do you determine whether two words X and Y are anagrams?
    Compare the letter distributions
    Time proportional to number of letters in each word
  Suppose this subroutine anagram(X,Y) is fast
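A minimal sketch of the anagram(X,Y) subroutine, assuming lowercase words with no spaces. It counts the letters of each word once, so the work is proportional to the word lengths; requiring the two words to differ follows the definition of an anagram as another word.

```python
from collections import Counter


def anagram(x, y):
    """True if x and y are different words with the same letter distribution."""
    return x != y and Counter(x) == Counter(y)


print(anagram("deposit", "topside"))            # True
print(anagram("subessential", "suitableness"))  # True
print(anagram("deposit", "deposits"))           # False
```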
Listing Anagrams of a Word
  Dictionary of 100,000 English words
  List all anagrams of “least”
  How should we represent the dictionary?
  Linear array
    Loop through dictionary: if anagram(X,least), include X in list
    Running time = 100,000 calls to anagram()
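The linear-array approach amounts to one pass over the whole dictionary per query. In this sketch the eight-word `dictionary` is a hypothetical stand-in for the 100,000-word list, and `list_anagrams` is an illustrative name.

```python
from collections import Counter


def list_anagrams(word, dictionary):
    """Scan every dictionary entry and keep those that are anagrams of word."""
    letters = Counter(word)              # letter distribution of the query word
    return [x for x in dictionary
            if x != word and Counter(x) == letters]


# Hypothetical miniature dictionary.
dictionary = ["least", "slate", "stale", "steal", "tales", "teals",
              "table", "laser"]
print(list_anagrams("least", dictionary))
# ['slate', 'stale', 'steal', 'tales', 'teals']
```

Every query examines every entry, whether or not it is related to the query word, which is the cost the next representation avoids.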
A Different Data Structure
  If X and Y are anagrams of each other, they are equivalent; the list of anagrams of X is the same as the list for Y
  This indicates an equivalence class of anagrams!
    deposit posited topside dopiest
    race care acre
    adroitly dilatory idolatry
Anagram Signatures
  Would like to store anagrams in the same class together
  How do we identify a class? Assign a signature!
    Sort all the letters in the anagram word(s)
    Same for each word in a class!
      acre race care: acer
      deposit posited topside dopiest: deiopst
      subessential suitableness: abeeilnssstu
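Computing a signature is a one-line operation: sort the letters of the word. The name `sign` matches the label used in the slides; treating words as plain lowercase strings is an assumption.

```python
def sign(word):
    """Signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


for word in ("acre", "race", "care", "deposit", "topside", "suitableness"):
    print(word, "->", sign(word))
```

Two words are anagrams exactly when their signatures are equal, so the signature can serve as the key of an equivalence class.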
Anagram Program
  words     after sign     after sort
  acre      acer: acre     acer: acre
  pots      opst: pots     acer: care
  stop      opst: stop     anps: snap
  care      acer: care     opst: pots
  post      opst: post     opst: stop
  snap      anps: snap     opst: post
Anagram Program
  after sort     after merge
  acer: acre     acer: acre care
  acer: care     anps: snap
  anps: snap     opst: pots stop post
  opst: pots
  opst: stop
  opst: post
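The sign, sort, and merge steps can be sketched as a single preprocessing function. The name `build_anagram_dictionary` and the list-of-pairs output format are assumptions for illustration; the input words are the six from the example.

```python
from itertools import groupby


def sign(word):
    """Signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


def build_anagram_dictionary(words):
    """Return a list of (signature, [words]) pairs, sorted by signature."""
    signed = [(sign(w), w) for w in words]       # sign
    signed.sort(key=lambda pair: pair[0])        # sort (stable, by signature only)
    return [(signature, [w for _, w in group])   # merge equal signatures
            for signature, group in groupby(signed, key=lambda pair: pair[0])]


classes = build_anagram_dictionary(["acre", "pots", "stop", "care", "post", "snap"])
for signature, members in classes:
    print(signature + ":", " ".join(members))
# acer: acre care
# anps: snap
# opst: pots stop post
```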
Listing Anagrams for Given Word X
  Compute sign(X) and look up sign(X) in dictionary using binary search
  List all words in list adjacent to sign(X)
  Example: post, sign gives opst, lookup gives the class for opst
    acer: acre care
    anps: snap
    opst: pots stop post
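A lookup against the merged representation might look like the sketch below. It uses the standard library `bisect_left` for the binary search; the parallel lists `signatures` and `classes` are an assumed storage layout, filled in by hand with the three classes from the example.

```python
from bisect import bisect_left


def sign(word):
    """Signature of a word: its letters in sorted order."""
    return "".join(sorted(word))


# Preprocessed dictionary: signatures in sorted order, one class per signature.
signatures = ["acer", "anps", "opst"]
classes = [["acre", "care"], ["snap"], ["pots", "stop", "post"]]


def lookup(word):
    """Return the anagram class of word, or an empty list if none is stored."""
    target = sign(word)
    i = bisect_left(signatures, target)          # binary search on signatures
    if i < len(signatures) and signatures[i] == target:
        return classes[i]
    return []


print(lookup("post"))   # ['pots', 'stop', 'post']
print(lookup("tree"))   # []
```

The returned class includes the query word itself when it is in the dictionary, as in the example for post.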
Efficiency of Anagram Program
  Once dictionary has been stored in new representation:
    Lookup takes at most 17 queries
    Listing time is proportional to number of anagrams in the class
  What about the cost of new representation?
    Sign each word, sort, and merge
    Expensive, but need to do it only once!
    Preprocessing
References
  Programming Pearls, by Jon Bentley, Addison-Wesley
  Great Ideas in Theoretical Computer Science, Steven Rudich
    A course at CMU