CSE 2320 Section 501/571 Fall 1999

Homework 2 Solution

1.
Operation of HEAPSORT on the array .

2.
Binary tree illustrating the execution of QUICKSORT on the array 10, 4, 15, 7, 23, 6, 15, 14, 16, 20.

3.
Operation of COUNTING-SORT on the array .

4.
For each of the following arrays, select which of the five sorts INSERTIONSORT, MERGESORT, HEAPSORT, QUICKSORT or COUNTINGSORT (as implemented in the textbook) will run fastest (non-asymptotically speaking). Describe any assumptions about the cost of operations within the sorting algorithms and justify your answers.

First, we will analyze each sorting algorithm in the general case, assigning values to the cost of specific operations. The following table lists my operation costs. Note that logical conditions are assumed to be evaluated in full, i.e., no short-circuit and's or or's.

 Operation         Cost
 Constant          0
 Variable lookup   1
 Array lookup      2
 Arithmetic        2
 Assignment        1
 Comparison        1
 Logical and       1
 If-then-else      2
 For statement     3
 While statement   2
 Function call     4

InsertionSort: Following the analysis on page 8 and using the above table, we can assign the following costs: c1 = 4, c2 = 5, c4 = 4, c5 = 10, c6 = 8, c7 = 4, c8 = 6. Then, all we need is to determine the values of tj.
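
The page-8 equation can be evaluated mechanically once the tj are known. Below is a small Python sketch (our own helper, not from the textbook) of that closed form, using the cj costs assigned above and omitting the c3 comment term, checked against the tj values found in parts (a) and (b):

```python
# Closed-form INSERTION-SORT cost from the textbook's page-8 analysis,
# with c1..c8 as assigned above. t is the list [t2, ..., tn].
def insertion_sort_cost(n, t, c1=4, c2=5, c4=4, c5=10, c6=8, c7=4, c8=6):
    assert len(t) == n - 1
    return (c1 * n + (c2 + c4 + c8) * (n - 1)
            + c5 * sum(t) + (c6 + c7) * sum(tj - 1 for tj in t))

print(insertion_sort_cost(10, [2, 1, 3, 1, 5, 2, 4, 2, 2]))  # part (a): 551
print(insertion_sort_cost(10, [1, 2, 1, 5, 3, 2, 5, 5, 3]))  # part (b): 661
```

Both values agree with the totals reported in parts (a) and (b) below.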

MergeSort: Referring to the MERGESORT pseudocode on page 13, we have the following costs and iterations, where t1 = 1 if the recursion continues (t1 = 0 when the stopping condition holds) and TM(n) is the running time of MERGE.

 Line  Cost          Iterations
 1     5             1
 2     10            t1
 3     T(⌈n/2⌉)+7    t1
 4     T(⌊n/2⌋)+9    t1
 5     TM(n)+8       t1

This yields a recurrence of the form T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + TM(n) + c for n > 1, with T(1) = 5, where the constant c collects the non-recursive costs of lines 1-5.

We will also need to add an extra n for a one-time declaration of the auxiliary B array used in MERGE. Referring to the pseudocode for MERGE in the class notes, we have the following costs and iterations.

 Line  Cost  Iterations
 1     5     n + 1
 2     7     n
 3     3     1
 4     5     1
 5     9     t1 + 1
 6     9     t1
 7     7     t1 t2
 8     5     t1 t2
 9     7     t1 (1 - t2)
 10    5     t1 (1 - t2)
 11    5     t1
 12    5     1
 13    3     t3
 14    3     t3
 15    5     n - t1 + 1
 16    7     n - t1
 17    5     n - t1

Summing the Cost*Iterations, we get TM(n) = 29n + 18t1 + 6t3 + 24, where t1 is the number of elements copied within the while loop, and t3 = 1 if the remaining elements are in the right side (otherwise t3 = 0).
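
To make t1 and t3 concrete, here is a sketch of a MERGE-style routine (our own variable names, not the class-notes pseudocode) that records both quantities while merging two sorted lists:

```python
def merge_stats(left, right):
    """Merge two sorted lists, recording t1 (elements copied by the main
    while loop, which stops when one side is exhausted) and t3 (1 if the
    leftover elements are on the right side, else 0)."""
    out, i, j, t1 = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
        t1 += 1                       # one element copied per iteration
    t3 = 1 if j < len(right) else 0   # leftovers on the right side?
    out.extend(left[i:])              # copy whichever side remains
    out.extend(right[j:])
    return out, t1, t3

print(merge_stats([2, 4], [1, 3, 5]))            # leftovers on the right: t3 = 1
print(merge_stats([6, 7, 8, 9, 10], [1, 2, 3, 4, 5]))  # left all larger: t1 = 5, t3 = 0
```

The second call mirrors the reverse-sorted situation analyzed in part (c): every right-side element is copied by the while loop, so t1 equals the size of the right half and t3 = 0.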

HeapSort: Referring to the HEAPSORT pseudocode on page 147, we have the following costs and iterations, where TB(n) is the cost of calling BUILDHEAP and TH(n-i+1) is the cost of the call to HEAPIFY made on the ith iteration.

 Line  Cost          Iterations
 1     TB(n)+5       1
 2     4             n
 3     15            n-1
 4     5             n-1
 5     TH(n-i+1)+5   n-1

Summing Cost*Iterations yields T(n) = TB(n) + Σ(i=2 to n) TH(n-i+1) + 29n - 20. Referring to the BUILDHEAP pseudocode on page 145, we have the following costs and iterations, where the calls to HEAPIFY may have different costs than the calls in HEAPSORT, even though the arguments are the same.

 Line  Cost           Iterations
 1     3              1
 2     7              ⌊n/2⌋ + 1
 3     TH'(n-i+1)+6   ⌊n/2⌋

Summing Cost*Iterations yields TB(n) = Σ(i=1 to ⌊n/2⌋) TH'(n-i+1) + 13⌊n/2⌋ + 10. Referring to the HEAPIFY pseudocode on page 143, we have the following costs and iterations.

 Line  Cost         Iterations
 1     9            1
 2     11           1
 3     13           1
 4     3            t1
 5     3            1 - t1
 6     13           1
 7     3            t2
 8     5            1
 9     17           t3
 10    TH''(t4)+6   t3

Summing Cost*Iterations yields T(n) = t3 TH''(t4) + 21 t3 + 3 t2 + 54, where t2=1 if the right child is largest, t3=1 if a violation occurs, and t4 is the number of keys in the heap rooted at A[largest].

Evaluating these running time expressions analytically is too tedious, so I implemented the algorithms and kept a counter according to the costs described above.
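
The counting approach can be sketched as follows: a counter object charges each abstract operation its cost from the table above. The exact charging points inside HEAPIFY below are our assumptions (note that, per the earlier assumption, logical and's are charged in full even though Python short-circuits), so the totals are illustrative rather than the report's exact figures:

```python
# Operation costs from the table above.
COST = {"const": 0, "var": 1, "arr": 2, "arith": 2, "assign": 1,
        "cmp": 1, "and": 1, "if": 2, "for": 3, "while": 2, "call": 4}

class Counter:
    """Accumulates operation costs as the algorithm runs."""
    def __init__(self):
        self.total = 0
    def charge(self, *ops):
        for op in ops:
            self.total += COST[op]

def heapify(a, i, heapsize, c):
    c.charge("call")
    l, r = 2 * i + 1, 2 * i + 2            # 0-based children
    c.charge("assign", "arith", "assign", "arith")
    largest = i
    c.charge("assign", "var")
    if l < heapsize and a[l] > a[largest]:
        largest = l
    # charge the full condition: no short-circuit evaluation assumed
    c.charge("if", "and", "cmp", "cmp", "arr", "arr", "assign")
    if r < heapsize and a[r] > a[largest]:
        largest = r
    c.charge("if", "and", "cmp", "cmp", "arr", "arr", "assign")
    if largest != i:                       # violation: swap down and recurse
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heapsize, c)
    c.charge("if", "cmp")

def heapsort(a, c):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # BUILD-HEAP phase
        c.charge("for")
        heapify(a, i, n, c)
    for end in range(n - 1, 0, -1):        # repeatedly extract the max
        c.charge("for")
        a[0], a[end] = a[end], a[0]
        c.charge("assign", "assign", "arr", "arr")
        heapify(a, 0, end, c)
```

Running this on the problem-2 array sorts it while accumulating a total in the same spirit as the T(n) figures reported below, though the exact number depends on the charging choices.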

QuickSort: Referring to the QUICKSORT pseudocode on page 154, we have the following costs and iterations.

 Line  Cost        Iterations
 1     5           1
 2     TP(n)+9     t1
 3     T(t2)+7     t1
 4     T(n-t2)+9   t1

Summing Cost*Iterations yields the recurrence T(n) = T(t2) + T(n - t2) + TP(n) + 30 for n > 1, with T(1) = 5, where t2 is the number of elements in the left subarray after PARTITION.

Referring to the PARTITION pseudocode on page 154, we have the following costs and iterations.

 Line  Cost  Iterations
 1     5     1
 2     5     1
 3     5     1
 4     2     t1
 5     5     t2
 6     7     t2 + 1
 7     5     t3
 8     7     t3 + 1
 9     5     t1
 10    17    t1 - 1
 11    2     1

where t1 - 1 is the number of swaps that take place, and t2 + t3 = n+2. Thus, TP(n) = 12n + 24 t1 + 36. Evaluating these running time expressions analytically is too tedious, so I implemented the algorithms and kept a counter according to the costs described above.

CountingSort: Referring to the pseudocode on page 176, we get the following costs and iterations.

 Line  Cost  Iterations
 1     4     k+1
 2     4     k
 3     4     n+1
 4     13    n
 5     0     1
 6     4     k
 7     14    k-1
 8     0     1
 9     4     n+1
 10    11    n
 11    13    n

Summing the Cost*Iterations and adding k for the cost of allocating array C, we get T(n,k) = 45n + 27k - 2, where k is the maximum key value.
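
As a sanity check, the closed form can be evaluated for the (n, k) pairs that appear in parts (a)-(c); a minimal, non-stable counting sort (our own simplified variant, keys assumed to lie in 1..k) is included to make k's role concrete:

```python
def counting_sort(a, k):
    """Simplified COUNTING-SORT: count occurrences of each key in 1..k,
    then emit the keys in order. Not the textbook's stable version."""
    c = [0] * (k + 1)              # the C array (the +k allocation cost)
    for x in a:
        c[x] += 1
    out = []
    for v in range(1, k + 1):
        out.extend([v] * c[v])
    return out

def counting_sort_cost(n, k):
    return 45 * n + 27 * k - 2     # closed form derived above

print(counting_sort([10, 4, 15, 7, 23, 6, 15, 14, 16, 20], 23))
print(counting_sort_cost(10, 23))       # part (a): 1069
print(counting_sort_cost(10, 8))        # part (b): 664
print(counting_sort_cost(100, 100000))  # part (c): 2704498
```

All three values match the T(n,k) figures reported in parts (a)-(c) below.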

(a)

InsertionSort: t2 = 2, t3 = 1, t4 = 3, t5 = 1, t6 = 5, t7 = 2, t8 = 4, t9 = 2, t10 = 2. Thus, Σ tj = 22 and Σ (tj - 1) = 13. Plugging these values into the equation on page 8 yields T(n) = 551.

MergeSort: Expanding the recurrence for T(10), we get

T(10) = TM(2) + TM(2) + TM(3) + TM(5) + TM(2) + TM(2) + TM(3) + TM(5) + TM(10) + 386

where all the TM's are left in the equation, because each one may have a different t1 and t3. For all of the TM(i) terms, t1 = i-1, except for the 4th, 7th and 8th terms where t1 = i-2. And, t3 = 1 for all but the 1st and 9th terms. Plugging all this in (not forgetting to add an extra n=10 for allocation of B), we get T(n) = 1632.

HeapSort: T(n) = 2585.

QuickSort: T(n) = 1678.

CountingSort: For n=10 and k=23, T(10,23) = 1069.

So, INSERTIONSORT is the fastest.

(b)

InsertionSort: t2 = 1, t3 = 2, t4 = 1, t5 = 5, t6 = 3, t7 = 2, t8 = 5, t9 = 5, t10 = 3. Thus, Σ tj = 27 and Σ (tj - 1) = 18. Plugging these values into the equation on page 8 yields T(n) = 661.

MergeSort: Expanding the recurrence for T(10), we get

T(10) = TM(2) + TM(2) + TM(3) + TM(5) + TM(2) + TM(2) + TM(3) + TM(5) + TM(10) + 386

where all the TM's are left in the equation, because each one may have a different t1 and t3. For all of the TM(i) terms, t1 = i-1, except for the 7th term where t1 = 1. And, t3 = 1 for all but the 2nd, 8th and 9th terms. Plugging all this in (not forgetting to add an extra n=10 for allocation of B), we get T(n) = 1662.

HeapSort: T(n) = 2582.

QuickSort: T(n) = 1738.

CountingSort: For n=10 and k=8, T(10,8) = 664.

So, INSERTIONSORT and COUNTINGSORT are the fastest.

(c)
Array A[1..100], where A[i] = 1000(101 - i).

InsertionSort: This is the worst-case scenario for InsertionSort, so the equation on page 9 applies. Plugging in n=100 and the above ci's yields T(n) = 11n² + 15n - 25, and T(100) = 111475.
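
A quick check (our own sketch) that this array really is the worst case: the array is strictly decreasing, so every key is shifted past all previously sorted keys, giving n(n-1)/2 = 4950 shifts for n = 100, and the closed form evaluates to 111475:

```python
n = 100
A = [1000 * (101 - i) for i in range(1, n + 1)]  # A[i] = 1000(101 - i)

# Plain insertion sort, counting element shifts.
a = list(A)
shifts = 0
for j in range(1, n):
    key, i = a[j], j - 1
    while i >= 0 and a[i] > key:
        a[i + 1] = a[i]
        i -= 1
        shifts += 1
    a[i + 1] = key

print(shifts)                     # n(n-1)/2 = 4950: every pair is inverted
print(11 * n * n + 15 * n - 25)   # 111475
```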

MergeSort: Because the array is reverse sorted, the MERGE algorithm will always be merging two sorted subarrays in which every element of the left subarray is greater than every element of the right subarray. Therefore, all of the elements in the right subarray will be copied within the while loop (t1 = ⌊n/2⌋ and t3 = 0). Thus, TM(n) = 29n + 18⌊n/2⌋ + 24 for all calls to MERGE, and the recurrence can be evaluated with this TM substituted in.

Executing this recurrence for n=100 yields (not forgetting to add an extra 100 for the B array allocation) T(n) = 32238.

HeapSort: T(n) = 44232.

QuickSort: T(n) = 71508.

CountingSort: For n=100 and k=100000, T(100,100000) = 2704498.

So, MERGESORT is the fastest with HEAPSORT running a close second.

5.
For each of the following types of data structures, show the final data structure after inserting the keys 4, 2, 6, 7, 6, 5 in this order into an initially empty data structure.

(a)
Stack.

(b)
Queue.

(c)