hw2

Upload: ahui-chung

Post on 24-Sep-2015

  • DSA homework #2, due date: April 14, 2015

    Name: Jui-Hui Chung, ID: B02202008

    1 More about C++

    (1) Because the integer c is a local variable in the scope of sub1, it is destroyed when the function exits. The returned reference is then dangling, and using it is undefined behavior that will probably result in a segmentation fault.
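As a rough illustration of the point above, here is a minimal sketch. The body of `sub1` is not reproduced in this write-up, so the subtraction and the name `sub1_by_value` are assumptions made only for this example.

```cpp
// Hypothetical reconstruction: the assignment's real sub1 is not shown here.

// Problematic shape, left commented out because using the result is undefined behavior:
//   int& sub1(int a, int b) { int c = a - b; return c; }  // c is destroyed at return

// One possible fix: return by value, so the caller receives its own copy.
int sub1_by_value(int a, int b) {
    int c = a - b;  // local variable, destroyed when the function exits
    return c;       // the value is copied out before c is destroyed
}
```

Returning by value is only one option; passing an output parameter by reference would avoid the dangling reference as well.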

    (2) On most operating systems, programs are not permitted to access memory at address 0 because that memory is reserved by the operating system. If a - b = 0 inadvertently, NULL is assigned to the pointer, and a later assignment through it will lead to a run-time error.

    2 Arrays, Linked List, and Recursion

    (1) First, sort with a non-comparison-based algorithm (counting sort, radix sort, bucket sort, etc.), which takes O(n) when the keys are integers in a bounded range. Then scan from the first index for the five duplicated items in linear time, O(n). The total complexity is O(n), with some extra space needed for sorting.
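The sort-then-scan idea can be sketched as follows. This is a minimal sketch, assuming the values are non-negative integers no larger than a known `max_value`; the function names are chosen for this example only.

```cpp
#include <cstddef>
#include <vector>

// Counting sort for values known to lie in [0, max_value]; runs in O(n + max_value).
std::vector<int> counting_sort(const std::vector<int>& values, int max_value) {
    std::vector<std::size_t> count(static_cast<std::size_t>(max_value) + 1, 0);
    for (int v : values) ++count[static_cast<std::size_t>(v)];
    std::vector<int> sorted;
    sorted.reserve(values.size());
    for (int v = 0; v <= max_value; ++v)
        sorted.insert(sorted.end(), count[static_cast<std::size_t>(v)], v);
    return sorted;
}

// Scans the sorted array once and reports each value that occurs more than once.
std::vector<int> repeated_values(const std::vector<int>& values, int max_value) {
    const std::vector<int> sorted = counting_sort(values, max_value);
    std::vector<int> repeated;
    for (std::size_t i = 1; i < sorted.size(); ++i) {
        const bool same_as_prev = sorted[i] == sorted[i - 1];
        const bool already_reported = !repeated.empty() && repeated.back() == sorted[i];
        if (same_as_prev && !already_reported) repeated.push_back(sorted[i]);
    }
    return repeated;
}
```

The count array is the extra space mentioned above; with keys bounded by n, both passes stay linear.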

    (2) Compress the matrix into a 1-D sequence stored as a doubly linked list. Each node (i, j, k) specifies three properties: the row index, the column index, and the entry. Construct the list sorted by row index and then by column index. To get the value at location [i][j], start from the first node and follow $i(i-1)/2 + j$ steps to the desired node.
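A minimal sketch of this lookup, assuming 1-based indices and a lower-triangular layout in which row i holds columns 1..i (the layout is inferred from the formula, not stated in the answer). `Node`, `position`, and `get` are names made up for this example. The first node is counted as position 1, so the walk takes one step fewer than the position.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// One stored entry: row index, column index, and value (the (i, j, k) triple).
struct Node {
    int row;
    int col;
    double value;
};

// 1-based position of entry (i, j) in row-major order: rows 1..i-1 contribute
// i(i-1)/2 entries, and (i, j) is the j-th entry of row i.
std::size_t position(int i, int j) {
    assert(1 <= j && j <= i);  // only the stored lower triangle is addressable
    return static_cast<std::size_t>(i) * static_cast<std::size_t>(i - 1) / 2
           + static_cast<std::size_t>(j);
}

// Walks from the first node to entry (i, j).
double get(const std::list<Node>& entries, int i, int j) {
    auto it = std::next(entries.begin(),
                        static_cast<std::ptrdiff_t>(position(i, j) - 1));
    return it->value;
}
```

The walk is linear in the position; a contiguous array indexed by the same formula would give constant-time access at the cost of the linked structure.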

    (3) A comparison in linear time may be possible. Store the initial location of cursorL and compare cursorM to cursorL. If the elements are the same, advance both cursors one step and continue. If they differ at some point, jump cursorL back to its initial location and compare cursorM to cursorL again. Finally, if cursorM has traversed twice the size of the lists (enough to tell whether any position of cursorM lines up with the initial location of cursorL), then the lists are not the same. If cursorL traverses exactly the size of the list, then they are identical.
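For comparison, here is a simpler check of the same property. It is not the cursor walk described above but a swapped-in technique: search for L inside two back-to-back copies of M, which mirrors the "traverse twice the size" bound. It assumes the lists are given as `std::vector<int>`, the function name is hypothetical, and `std::search` can take quadratic time in the worst case.

```cpp
#include <algorithm>
#include <vector>

// True if m holds the same circular sequence as l, possibly starting elsewhere.
bool same_circular_sequence(const std::vector<int>& l, const std::vector<int>& m) {
    if (l.size() != m.size()) return false;
    if (l.empty()) return true;
    // Two copies of m contain every rotation of m as a contiguous run.
    std::vector<int> doubled(m);
    doubled.insert(doubled.end(), m.begin(), m.end());
    return std::search(doubled.begin(), doubled.end(), l.begin(), l.end())
           != doubled.end();
}
```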

    (4) The pseudocode is as follows:

    Paritysort(A, p, q)
        if p < q
            (r, s) = Checkparity(A, p, q)
            Paritysort(A, r, s)

    Checkparity(A, p, q)
        if A[p] is even and A[q] is odd
            return (p + 1, q - 1)
        else if A[p] is even and A[q] is even
            return (p + 1, q)
        else if A[p] is odd and A[q] is odd
            return (p, q - 1)
        else if A[p] is odd and A[q] is even
            exchange A[p] with A[q]
            return (p + 1, q - 1)

    (5) The iterative pseudocode resembles the recursive one but is less readable:

    Paritysort(A, p, q)
        while p < q
            if A[p] is even and A[q] is odd
                p = p + 1
                q = q - 1
            else if A[p] is even and A[q] is even
                p = p + 1
            else if A[p] is odd and A[q] is odd
                q = q - 1
            else if A[p] is odd and A[q] is even
                exchange A[p] with A[q]
                p = p + 1
                q = q - 1
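The iterative version translates almost line for line into C++. This is a minimal sketch, assuming the parity tests refer to the elements at positions p and q and that the goal is to place even elements before odd ones; `parity_sort` and `is_parity_sorted` are names chosen for this example.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Moves even elements to the front and odd elements to the back, in place.
void parity_sort(std::vector<int>& a) {
    int p = 0;
    int q = static_cast<int>(a.size()) - 1;
    while (p < q) {
        const bool left_even = a[p] % 2 == 0;
        const bool right_even = a[q] % 2 == 0;
        if (left_even && !right_even) {         // both already on the correct side
            ++p;
            --q;
        } else if (left_even && right_even) {   // left is fine, right must wait
            ++p;
        } else if (!left_even && !right_even) { // right is fine, left must wait
            --q;
        } else {                                // odd on the left, even on the right
            std::swap(a[p], a[q]);
            ++p;
            --q;
        }
    }
}

// Helper for checking the result: true if no odd element precedes an even one.
bool is_parity_sorted(const std::vector<int>& a) {
    return std::is_partitioned(a.begin(), a.end(),
                               [](int x) { return x % 2 == 0; });
}
```

Each iteration moves at least one of the two indices, so the loop runs at most n - 1 times.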

    3 Asymptotic Complexity

    (1) Suppose $d(n) = n^2 + 2n$ and $e(n) = n^2 + n$; then $f(n) = n^2$ and $g(n) = n^2$, so $f(n) - g(n) = 0$. Thus $d(n) - e(n) = n \neq O(0)$, and $n$ is not even bounded by a constant, $O(1)$. A contradiction.

    (2) $(n+1)^5 = n^5 + 5n^4 + 10n^3 + 10n^2 + 5n + 1 \le (1 + 5 + 10 + 10 + 5 + 1)n^5 = cn^5$ for $c = 32$, when $n \ge n_0 = 1$.

    (3) Big-O notation describes the asymptotic case, when the input size becomes very large. For $25n\log n = O(n\log n)$ and $n^2 = O(n^2)$, Wolfram calculated that the former is bigger for $n$ up to about 119 (with $\log$ taken as the natural logarithm).
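The crossover can also be checked by brute force. This is a minimal sketch, assuming the natural logarithm; a base-2 logarithm would move the crossover. The function name is made up for this example.

```cpp
#include <cmath>

// Returns the smallest n >= 2 at which n^2 is at least c * n * ln(n),
// i.e. the first input size where the quadratic cost is no longer the smaller one.
int crossover_point(double c) {
    int n = 2;
    while (c * n * std::log(static_cast<double>(n)) > static_cast<double>(n) * n) {
        ++n;
    }
    return n;
}
```

For c = 25 the loop stops at n = 120, so the $25n\log n$ cost is the larger one for every n from 2 to 119.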

    4 Playing with Big Data

    (1) The main data structure is a hash table implemented with the C++ STL unordered_multimap. Much like unordered_map, it maps keys to values, but unordered_multimap allows different elements to have equivalent keys. The basic hash-table operations take constant time in the average case.
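A minimal sketch of how such a table can be queried with `equal_range()`. The `Impression` fields and the names `UserTable` and `records_of` are assumptions for illustration; the real records carry more fields than shown here.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical record layout, reduced to three fields for the example.
struct Impression {
    int ad_id;
    int query_id;
    int clicks;
};

using UserTable = std::unordered_multimap<int, Impression>;  // key: UserID

// Collects every record stored under one user key.
std::vector<Impression> records_of(const UserTable& table, int user_id) {
    std::vector<Impression> out;
    auto range = table.equal_range(user_id);  // average O(1) to locate the key
    for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);            // linear in the number of matches
    }
    return out;
}
```

Because equal keys are stored next to each other in the container, one lookup yields all records of a user.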

    get(u, a, q, p, d): equal_range() to search, constant in the average case and linear in the container size in the worst case. A for loop compares the properties, linear in the number of properties of the user. get(u, a, q, p, d) = O(# of properties of the user) in the average case.

    clicked(u): equal_range() to search, constant in the average case and linear in the container size in the worst case. A for loop stores copies of the AdID and query, linear in the number of properties of the user. std::sort (linearithmic in the iterator distance) orders by query first, then std::stable_sort ($O(N \lg^2 N)$ in that distance when no extra buffer is available) orders by AdID. std::unique (linear in the distance) eliminates the duplicate information. A for loop writes the pairs of AdID and query to standard output. clicked(u) = $O(N \lg^2 N)$, where N is the number of AdIDs of the user. The complexity is for the average case because of the hash-table lookup.
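The sort, stable sort, and unique sequence can be sketched as below. This is a minimal sketch, assuming each record has been reduced to an (AdID, QueryID) pair of integers; the alias `AdQuery` and the function name are made up for this example.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using AdQuery = std::pair<int, int>;  // (AdID, QueryID)

// Orders by QueryID, then stably by AdID, and removes exact duplicates.
std::vector<AdQuery> unique_ad_query_pairs(std::vector<AdQuery> pairs) {
    std::sort(pairs.begin(), pairs.end(),
              [](const AdQuery& a, const AdQuery& b) { return a.second < b.second; });
    // The stable sort keeps the QueryID order inside each group of equal AdIDs.
    std::stable_sort(pairs.begin(), pairs.end(),
                     [](const AdQuery& a, const AdQuery& b) { return a.first < b.first; });
    // Equal pairs are now adjacent, so std::unique can drop the repeats.
    pairs.erase(std::unique(pairs.begin(), pairs.end()), pairs.end());
    return pairs;
}
```

A single std::sort with the default pair comparison would produce the same (AdID, QueryID) order; the two-pass form is shown because it follows the steps listed above.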

    impressed(u1, u2): equal_range() to search, constant in the average case and linear in the container size in the worst case. A for loop stores copies of the index and AdID, linear in the number of properties of the user. std::sort (linearithmic in the distance) orders both stored copies by AdID. A while loop (linear in the distance) compares the sorted AdIDs. Another while loop (quadratic time, but the data compared here are fairly small) eliminates the duplicate information. impressed(u1, u2) = O(# of AdIDs of the user) + $O(N \lg N)$, where N is the number of AdIDs of the user.
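The linear comparison of the two sorted AdID lists can be sketched as a two-pointer walk. This is a minimal sketch, assuming both lists are already sorted in ascending order and hold plain integer AdIDs; the function name is hypothetical. It drops duplicates during the walk, which is a variation on the separate de-duplication loop described above.

```cpp
#include <cstddef>
#include <vector>

// Returns the AdIDs present in both inputs; both must be sorted ascending.
std::vector<int> common_ad_ids(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> common;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;                      // a[i] cannot appear in the rest of b
        } else if (b[j] < a[i]) {
            ++j;                      // b[j] cannot appear in the rest of a
        } else {
            // Match: record it once, even if the AdID repeats in the inputs.
            if (common.empty() || common.back() != a[i]) common.push_back(a[i]);
            ++i;
            ++j;
        }
    }
    return common;
}
```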

    profit(a, ): equal_range() to search, constant in the average case and linear in the container size in the worst case. A for loop stores copies of the index and user, linear in the number of properties of the user. A while loop (linear time) adds up the click-through rate and compares it to the profit. profit(a, ) = $O(N \lg N)$, where N is the number of users of the AdID.

    It takes around five minutes to load the data while constructing the hash table. Standard input and output are used with std::ios::sync_with_stdio(false).

    Each of the four operations takes less than a second to complete. To maximize the efficiency of the individual member functions and to get to know the provided data better (maybe it's cheating?), I wrote a utility function to check for the extreme cases: User 0 contains 37733352 entries and AdID contains 1350400 entries. I therefore separated out the User 0 implementation. Afterwards, I tried multi-threaded programming with OpenMP. The real (wall-clock) time dropped by nearly a factor of ten, but the CPU time consumed is much greater.

    I finished the homework myself, and the hints provided by the TA were of great help.
