Lecture 3: Nearest Neighbor Algorithms
![Page 1: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/1.jpg)
Lecture 3: Nearest Neighbor Algorithms
Shang-Hua Teng
![Page 2: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/2.jpg)
What is an Algorithm?
• A computable set of steps to achieve a desired result from a given input
• Example:
  – Input: An array A of n numbers $a_1, a_2, \ldots, a_n$
  – Desired result: $\sum_{k=1}^{n} a_k$
• Pseudo-code of Algorithm SUM
![Page 3: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/3.jpg)
Pseudo-code of Algorithm SUM
Algorithm SUM(A = a_1, a_2, ..., a_n)
    s ← a_1
    for k ← 2 to n
        s ← s + a_k
    return s
Complexity:
• Input size: n
• Number of steps: n-1 additions
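The SUM pseudo-code can be sketched in Python as follows. This is a minimal illustration, assuming a non-empty list; the function name `algorithm_sum` and the `additions` counter are hypothetical additions used only to make the n-1 step count visible.

```python
def algorithm_sum(a):
    """Return (sum of a, number of additions performed); a must be non-empty."""
    s = a[0]                      # s <- a_1
    additions = 0
    for k in range(1, len(a)):    # k <- 2 to n (0-based here)
        s += a[k]                 # s <- s + a_k
        additions += 1
    return s, additions


total, steps = algorithm_sum([3, 1, 4, 1, 5])
print(total, steps)  # 14 4
```

For an input of size n = 5 the loop runs 4 times, matching the n-1 additions stated on the slide.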
![Page 4: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/4.jpg)
Example 2: Integer Multiplication
c = a × b
• When do we need to multiply two very large numbers?
  – In Cryptography and Network Security
    • messages are represented as numbers
    • encryption and decryption need to multiply numbers
![Page 5: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/5.jpg)
How to multiply two n-bit numbers
[Figure: grade-school multiplication of two n-bit numbers; n partial products of n bits each are added to give a 2n-bit product]
Complexity
• Input size: 2n bits
• $n^2$ bit operations
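The grade-school method can be sketched as shift-and-add in Python. This is an illustrative sketch, not the slide's own code: the name `schoolbook_multiply` and the `bit_ops` counter are assumptions, and the counter tallies only the single-bit products (the additions of the n partial products cost a further number of bit operations of the same order).

```python
def schoolbook_multiply(a, b, n):
    """Multiply two n-bit non-negative integers by shift-and-add.

    Returns (product, bit_ops), where bit_ops counts single-bit AND operations.
    """
    product = 0
    bit_ops = 0
    for i in range(n):                 # one partial product per bit of b
        b_bit = (b >> i) & 1
        partial = 0
        for j in range(n):             # n single-bit products per row
            a_bit = (a >> j) & 1
            partial |= (a_bit & b_bit) << j
            bit_ops += 1
        product += partial << i        # shift the row into place and add
    return product, bit_ops


print(schoolbook_multiply(13, 11, 4))  # (143, 16)
```

With n = 4 the inner statement runs 16 = n² times, which is where the quadratic bit-operation count comes from.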
![Page 6: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/6.jpg)
Asymptotic Notation of Complexity
• As the input size grows, how fast does the running time grow?
  – $T_1(n) = 100n$
  – $T_2(n) = n^2$
• Which algorithm is better?
• When n is small (n < 100), $T_2$ is smaller
• As n becomes larger, $T_2$ grows much faster
• To solve large-scale problems, algorithm 1 is preferred.
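A small Python sketch makes the crossover between the two running times concrete. The function names `t1` and `t2` are hypothetical stand-ins for the step counts $T_1$ and $T_2$ on the slide.

```python
def t1(n):
    """Step count of algorithm 1: 100 n."""
    return 100 * n


def t2(n):
    """Step count of algorithm 2: n squared."""
    return n * n


for n in (10, 100, 1000):
    print(n, t1(n), t2(n))
```

Below n = 100 the quadratic algorithm takes fewer steps, the two are equal at n = 100, and beyond that the linear algorithm wins by a growing margin.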
![Page 7: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/7.jpg)
Asymptotic Notation (Removing the constant factor)
• The $\Theta$ Notation
  $\Theta(g(n)) = \{ f(n) :$ there exist positive $c_1$, $c_2$ and $n_0$ such that $0 \le c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n > n_0 \}$
• For example $T(n) = 4n\log n + n = \Theta(n\log n)$
• For example $n - 1 = \Theta(n)$
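As a worked check of the first example, here is one possible choice of constants, assuming logarithms are base 2 (the slide does not fix the base). For $n \ge 2$ we have $\log_2 n \ge 1$, hence $n \le n\log_2 n$, and therefore

$$
4\,n\log_2 n \;\le\; 4\,n\log_2 n + n \;\le\; 5\,n\log_2 n \qquad \text{for all } n > 1 .
$$

So $c_1 = 4$, $c_2 = 5$ and $n_0 = 1$ satisfy the definition with $g(n) = n\log_2 n$. Other constants work equally well; only their existence matters.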
![Page 8: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/8.jpg)
Asymptotic Notation (Removing the constant factor)
• The Big $O$ Notation
  $O(g(n)) = \{ f(n) :$ there exist positive $c$ and $n_0$ such that $0 \le f(n) \le c\,g(n)$ for all $n > n_0 \}$
• For example $T(n) = 4n\log n + n = O(n\log n)$
• But also $T(n) = 4n\log n + n = O(n^2)$
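To see why the looser bound also holds, note that $\log_2 n \le n$ for all $n \ge 1$ (base 2 is assumed here), so

$$
4\,n\log_2 n + n \;\le\; 4n^2 + n^2 \;=\; 5n^2 \qquad \text{for all } n \ge 1 ,
$$

which satisfies the Big $O$ definition with $c = 5$ and $g(n) = n^2$. The bound is valid but not tight: no positive constant $c_1$ gives $c_1 n^2 \le 4n\log_2 n + n$ for all large $n$, so $T(n)$ is not $\Theta(n^2)$.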
![Page 9: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/9.jpg)
Nearest Neighbor Problem: General Formulation
Input: a set $P = \{p_1, p_2, \ldots, p_n\}$ of $n$ points in $d$ dimensions
Output: For each point $p \in P$, the nearest point to $p$
![Page 10: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/10.jpg)
Nearest Neighbor Problem
![Page 11: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/11.jpg)
Applications
• Points could be web pages; the closest neighbor is the most similar web page
• Points could be people; the closest neighbor could be the best friend
• Points could be biological species; the closest neighbor could be the closest species
• …
![Page 12: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/12.jpg)
An $O(dn^2)$ time Algorithm
Algorithm NN(P = {p_1, p_2, ..., p_n})
    for all i ∈ [1, n], j ∈ [1, n]
        compute d[i,j] = ||p_i − p_j||
    for i = 1 to n
        dist[i] = ∞
        for j = 1 to n
            if i ≠ j and d[i,j] < dist[i]
                then dist[i] = d[i,j]; NN[i] = j
    return NN, dist
Why $O(dn^2)$ time?
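The brute-force algorithm can be sketched in Python as below. This is a minimal illustration under some assumptions: points are tuples of equal length, indices are 0-based, distances are Euclidean, and the name `all_nearest_neighbors` is hypothetical. Unlike the pseudo-code, it computes each distance on the fly rather than filling a table first, which does not change the count: there are n² ordered pairs and each distance takes O(d) arithmetic operations, giving O(dn²) in total.

```python
import math


def all_nearest_neighbors(points):
    """For each point, find the index of and distance to its nearest other point."""
    n = len(points)
    nn = [None] * n
    dist = [math.inf] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d_ij = math.dist(points[i], points[j])  # O(d) work per pair
            if d_ij < dist[i]:
                dist[i] = d_ij
                nn[i] = j
    return nn, dist


pts = [(0, 0), (1, 0), (5, 5), (6, 5)]
print(all_nearest_neighbors(pts))  # ([1, 0, 3, 2], [1.0, 1.0, 1.0, 1.0])
```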
![Page 13: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/13.jpg)
Can We do better?
• Yes, Handout #4, by Jon Louis Bentley
  – When $d = 1$: $O(n\log n)$ time
  – For $d > 1$: $O(n\log^{d-1} n)$ time
![Page 14: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/14.jpg)
One-Dimensional Geometry
If we can order points from small to large, then we just need to look at the left neighbor and right neighbor of each point to find its nearest neighbor
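The one-dimensional idea can be sketched in Python as follows. This is an illustrative sketch, assuming at least two points given as plain numbers; the name `nearest_neighbors_1d` is hypothetical, and Python's built-in `sorted` stands in for whatever sorting routine is used.

```python
def nearest_neighbors_1d(xs):
    """Return nn, where nn[i] is the index of the point nearest to xs[i]."""
    # Indices of the points ordered from smallest to largest coordinate.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    nn = [None] * len(xs)
    for pos, i in enumerate(order):
        best = None
        # Only the left and right neighbors in sorted order can be nearest.
        for nb_pos in (pos - 1, pos + 1):
            if 0 <= nb_pos < len(order):
                j = order[nb_pos]
                if best is None or abs(xs[j] - xs[i]) < abs(xs[best] - xs[i]):
                    best = j
        nn[i] = best
    return nn


print(nearest_neighbors_1d([10, 1, 4, 11, 3]))  # [3, 4, 4, 0, 2]
```

After sorting, the scan does a constant amount of work per point, so the total cost is dominated by the sort.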
![Page 15: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/15.jpg)
Reduce to Sorting
• Input: Array A[1...n] of elements in an arbitrary order; array size n
• Output: Array A[1...n] of the same elements, but in non-decreasing order
![Page 16: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/16.jpg)
Divide and Conquer
• Divide the problem into a number of sub-problems (similar to the original problem but smaller);
• Conquer the sub-problems by solving them recursively (if a sub-problem is small enough, just solve it in a straightforward manner).
• Combine the solutions to the sub-problems into the solution for the original problem
![Page 17: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/17.jpg)
Algorithm Design Paradigm I
• Solve smaller problems, and use solutions to the smaller problems to solve larger ones
  – Divide and Conquer
• Correctness: mathematical induction
![Page 18: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/18.jpg)
Merge Sort
• Divide: split the n-element sequence to be sorted into two subsequences of n/2 elements each
• Conquer: Sort the two subsequences recursively using merge sort
• Combine: merge the two sorted subsequences to produce the sorted answer
• Note: during the recursion, if the subsequence has only one element, then do nothing.
![Page 19: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/19.jpg)
Merge-Sort(A,p,r)
A procedure that sorts the elements in the sub-array A[p..r] using divide and conquer
• Merge-Sort(A,p,r)
  – if p >= r, do nothing
  – if p < r then
    • q ← ⌊(p + r)/2⌋
    • Merge-Sort(A,p,q)
    • Merge-Sort(A,q+1,r)
    • Merge(A,p,q,r)
• Start by calling Merge-Sort(A,1,n)
![Page 20: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/20.jpg)
A = MergeArray(L,R)
Assume L[1:s] and R[1:t] are two sorted arrays of elements: MergeArray(L,R) forms a single sorted array A[1:s+t] of all elements in L and R.
• A = MergeArray(L,R)
  – L[s+1] ← ∞; R[t+1] ← ∞
  – i ← 1; j ← 1
  – for k ← 1 to s + t
    • do if L[i] ≤ R[j]
      – then A[k] ← L[i]; i ← i + 1
      – else A[k] ← R[j]; j ← j + 1
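The MergeArray procedure, with its ∞ sentinels, can be sketched in Python as below. This is a minimal sketch, assuming numeric elements that are all finite (so `math.inf` can serve as the sentinel) and 0-based lists; the name `merge_array` is hypothetical.

```python
import math


def merge_array(left, right):
    """Merge two sorted lists into one sorted list using infinity sentinels."""
    s, t = len(left), len(right)
    L = list(left) + [math.inf]    # L[s+1] <- infinity
    R = list(right) + [math.inf]   # R[t+1] <- infinity
    i = j = 0
    A = []
    for _ in range(s + t):         # exactly s + t iterations
        if L[i] <= R[j]:
            A.append(L[i])
            i += 1
        else:
            A.append(R[j])
            j += 1
    return A


print(merge_array([1, 4, 9], [2, 3, 10, 12]))  # [1, 2, 3, 4, 9, 10, 12]
```

The sentinels mean the loop never has to test whether one input is exhausted: an exhausted side always compares as larger.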
![Page 21: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/21.jpg)
Complexity of MergeArray
• At each iteration, we perform 1 comparison, 1 assignment (copy one element to A) and 2 increments (to k and i or j )
• So number of operations per iteration is 4.
• Thus, Merge-Array takes at most 4(s+t) time.
• Linear in the size of the input.
![Page 22: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/22.jpg)
Merge(A,p,q,r)
Assume A[p..q] and A[q+1..r] are two sorted sub-arrays. Merge(A,p,q,r) forms a single sorted array A[p..r].
• Merge(A,p,q,r)
  – s ← q − p + 1; t ← r − q
  – L ← A[p..q]; R ← A[q+1..r]
  – L[s+1] ← ∞; R[t+1] ← ∞
  – A[p..r] ← MergeArray(L,R)
![Page 23: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/23.jpg)
Merge-Sort(A,p,r)
A procedure that sorts the elements in the sub-array A[p..r] using divide and conquer
• Merge-Sort(A,p,r)
  – if p >= r, do nothing
  – if p < r then
    • q ← ⌊(p + r)/2⌋
    • Merge-Sort(A,p,q)
    • Merge-Sort(A,q+1,r)
    • Merge(A,p,q,r)
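Putting Merge-Sort and Merge together, a runnable Python sketch might look like this. It makes two choices that differ from the slides and should be read as assumptions of the sketch: indices are 0-based (so the initial call is `merge_sort(a, 0, len(a) - 1)`), and the merge step uses explicit bounds checks instead of ∞ sentinels so that it works for any comparable elements.

```python
def merge(a, p, q, r):
    """Merge the sorted slices a[p..q] and a[q+1..r] (inclusive) in place."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left if right is exhausted, or if left's head is not larger.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p, r):
    """Sort a[p..r] (inclusive) in place by divide and conquer."""
    if p >= r:                 # zero or one element: nothing to do
        return
    q = (p + r) // 2           # floor of the midpoint
    merge_sort(a, p, q)
    merge_sort(a, q + 1, r)
    merge(a, p, q, r)


data = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(data, 0, len(data) - 1)
print(data)  # [1, 2, 2, 3, 4, 5, 6, 7]
```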
![Page 24: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/24.jpg)
Running Time of Merge-Sort
• Running time is measured as a function of the input size, that is, the number of elements in the array A.
• The Divide-and-Conquer scheme yields a clean recurrence.
• Let T(n) be the running time of merge-sort for sorting an array of n elements.
• For simplicity assume n is a power of 2, that is, there exists k such that $n = 2^k$.
![Page 25: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/25.jpg)
Recurrence of T(n)
• $T(1) = 1$
• for $n > 1$, we have $T(n) = 2T(n/2) + 4n$

$T(n) = \begin{cases} 1 & \text{if } n = 1 \\ 2T(n/2) + 4n & \text{if } n > 1 \end{cases}$
![Page 26: Lecture 3 Nearest Neighbor Algorithms](https://reader036.vdocuments.mx/reader036/viewer/2022062423/56814868550346895db57534/html5/thumbnails/26.jpg)
Solution of Recurrence of T(n)
$T(n) = 4n\log n + n = O(n\log n)$
• Picture Proof by Recursion Tree
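The recursion-tree argument can also be written out by unrolling the recurrence. The derivation below is a sketch that assumes $n = 2^k$ and base-2 logarithms. At depth $i$ of the tree there are $2^i$ sub-problems of size $n/2^i$, each contributing $4(n/2^i)$ merge work, so every level costs $4n$:

$$
\begin{aligned}
T(n) &= 2\,T(n/2) + 4n \\
     &= 4\,T(n/4) + 4n + 4n \\
     &= \cdots \\
     &= 2^k\,T(n/2^k) + 4nk .
\end{aligned}
$$

With $k = \log_2 n$ the sub-problems have size 1, so $2^k\,T(1) = n$ and

$$
T(n) = 4n\log_2 n + n .
$$

As a sanity check, $T(2) = 2\,T(1) + 8 = 10$, and the closed form gives $4 \cdot 2 \cdot 1 + 2 = 10$.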