First, we will analyze each sorting algorithm in the general case, assigning values to the cost of specific operations. The following table lists my operation costs. Note that logical conditions are assumed to be evaluated in full, i.e., there is no short-circuit evaluation of and or or.
Operation | Cost |
Constant | 0 |
Variable lookup | 1 |
Array lookup | 2 |
Arithmetic | 2 |
Assignment | 1 |
Comparison | 1 |
Logical and | 1 |
If-then-else | 2 |
For statement | 3 |
While statement | 2 |
Function call | 4 |
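To make the charging concrete, the table can be written as a lookup and applied statement by statement. A small sketch (the operation names and the two example breakdowns are mine, chosen to illustrate the rules):

```python
# Operation costs from the table above, as a lookup.
COST = {
    "constant": 0, "variable": 1, "array": 2, "arith": 2,
    "assign": 1, "compare": 1, "and": 1,
    "if": 2, "for": 3, "while": 2, "call": 4,
}

# Example: the test `if p < r` charges two variable lookups,
# one comparison, and the if-then-else itself: 1 + 1 + 1 + 2 = 5.
if_test = 2 * COST["variable"] + COST["compare"] + COST["if"]

# Example: the assignment `A[i+1] = key` charges one array lookup,
# one arithmetic op, one variable lookup, and the assignment: 2 + 2 + 1 + 1 = 6.
shift = COST["array"] + COST["arith"] + COST["variable"] + COST["assign"]
```

Under this reading, `if p < r` costs 5 and `A[i+1] = key` costs 6, matching c_{8} = 6 in the InsertionSort costs below.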
InsertionSort: Following the analysis on page 8 and using the above table, we can assign the following costs: c_{1} = 4, c_{2} = 5, c_{4} = 4, c_{5} = 10, c_{6} = 8, c_{7} = 4, c_{8} = 6. Then, all we need is to determine the values of t_{j}.
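The t_{j} values can be read off mechanically by instrumenting the sort. A sketch (0-indexed Python, where t_{j} counts executions of the while-loop test for key j; note that Python short-circuits the test, unlike the full-evaluation assumption above):

```python
def insertion_sort_tj(A):
    """Sort A in place, recording t_j = number of while-test executions
    for each key position j (reported with the notes' 1-indexed j)."""
    t = {}
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        tests = 1                     # the final (failing) test also counts
        while i >= 0 and A[i] > key:  # NB: Python short-circuits here
            A[i + 1] = A[i]
            i -= 1
            tests += 1
        A[i + 1] = key
        t[j + 1] = tests              # j + 1 converts to the notes' indexing
    return A, t
```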
MergeSort: Referring to the MERGESORT pseudocode on page 13, we have the following costs and iterations, where t_{1} = 0 for the stopping condition and T_{M}(n) is the running time of MERGE.
Line | Cost | Iterations |
1 | 5 | 1 |
2 | 10 | t_{1} |
3 | T(⌈n/2⌉) + 7 | t_{1}
4 | T(⌊n/2⌋) + 9 | t_{1}
5 | T_{M}(n) + 8 | t_{1} |
This yields the following recurrence: T(n) = T(⌈n/2⌉) + T(⌊n/2⌋) + T_{M}(n) + 39 for n > 1, with T(1) = 5 (only line 1 executes when t_{1} = 0).
We will also need to add an extra n for a one-time declaration of the auxiliary B array used in MERGE. Referring to the pseudocode for MERGE in the class notes, we have the following costs and iterations.
Line | Cost | Iterations |
1 | 5 | n+1
2 | 7 | n
3 | 3 | 1
4 | 5 | 1
5 | 9 | t_{1} + 1
6 | 9 | t_{1}
7 | 7 | t_{1} t_{2}
8 | 5 | t_{1} t_{2}
9 | 7 | t_{1}(1-t_{2})
10 | 5 | t_{1}(1-t_{2})
11 | 5 | t_{1}
12 | 5 | 1
13 | 3 | t_{3}
14 | 3 | t_{3}
15 | 5 | n - t_{1} + 1
16 | 7 | n - t_{1}
17 | 5 | n - t_{1}
Summing the Cost*Iterations, we get T_{M}(n) = 29n + 18t_{1} + 6t_{3} + 24, where t_{1} is the number of elements copied within the while loop, and t_{3} = 1 if the remaining elements are in the right side (otherwise t_{3} = 0).
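These counts are easy to check with an instrumented merge. A list-based sketch (not the notes' exact pseudocode, but the same structure):

```python
def merge_counts(L, R):
    """Merge two sorted lists; return (merged, t1, t3) where t1 is the
    number of elements copied inside the while loop and t3 = 1 exactly
    when the leftover elements are on the right side."""
    out, i, j, t1 = [], 0, 0, 0
    while i < len(L) and j < len(R):  # main while loop
        if L[i] <= R[j]:
            out.append(L[i]); i += 1
        else:
            out.append(R[j]); j += 1
        t1 += 1
    t3 = 1 if j < len(R) else 0       # leftovers on the right?
    out.extend(L[i:])                 # the trailing copy loop
    out.extend(R[j:])
    return out, t1, t3
```

For example, merging two halves of a reverse-sorted input (left elements all larger) copies every right element inside the while loop and leaves t_{3} = 0.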
HeapSort: Referring to the HEAPSORT pseudocode on page 147, we have the following costs and iterations, where T_{B}(n) is the cost of calling BUILDHEAP and T_{H}(n) is the cost of calling HEAPIFY on the ith iteration.
Line | Cost | Iterations |
1 | T_{B}(n)+5 | 1 |
2 | 4 | n |
3 | 15 | n-1 |
4 | 5 | n-1 |
5 | T_{H}(n-i+1)+5 | n-1 |
Summing Cost*Iterations yields T(n) = T_{B}(n) + Σ_{i=2}^{n} T_{H}(n-i+1) + 29n - 20. Referring to the BUILDHEAP pseudocode on page 145, we have the following costs and iterations, where the calls to HEAPIFY may have different costs than the calls in HEAPSORT, even though the arguments are the same.
Line | Cost | Iterations |
1 | 3 | 1
2 | 7 | ⌊n/2⌋ + 1
3 | T_{H}^{'}(n-i+1)+6 | ⌊n/2⌋
Summing Cost*Iterations yields T_{B}(n) = Σ_{i=1}^{⌊n/2⌋} T_{H}^{'}(n-i+1) + 13⌊n/2⌋ + 10. Referring to the HEAPIFY pseudocode on page 143, we have the following costs and iterations.
Line | Cost | Iterations |
1 | 9 | 1 |
2 | 11 | 1 |
3 | 13 | 1 |
4 | 3 | t_{1} |
5 | 3 | 1 - t_{1} |
6 | 13 | 1 |
7 | 3 | t_{2} |
8 | 5 | 1 |
9 | 17 | t_{3} |
10 | T_{H}^{''}(t_{4})+6 | t_{3} |
Summing Cost*Iterations yields T(n) = t_{3} T_{H}^{''}(t_{4}) + 21 t_{3} + 3 t_{2} + 54, where t_{2}=1 if the right child is largest, t_{3}=1 if a violation occurs, and t_{4} is the number of keys in the heap rooted at A[largest].
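For concreteness, here is an instrumented sift-down consistent with this accounting (a 0-indexed max-heap sketch, not the notes' exact pseudocode); it returns the t_{2} and t_{3} flags for the top-level call:

```python
def heapify_counts(A, i, n):
    """Sift A[i] down in a max-heap of size n; return (t2, t3) for this
    call, where t2 = 1 if the right child is largest and t3 = 1 if a
    violation forces a swap and a recursive call."""
    left, right = 2 * i + 1, 2 * i + 2
    largest = left if left < n and A[left] > A[i] else i
    t2 = 0
    if right < n and A[right] > A[largest]:
        largest, t2 = right, 1
    t3 = 1 if largest != i else 0
    if t3:
        A[i], A[largest] = A[largest], A[i]
        heapify_counts(A, largest, n)  # recurse on the heap rooted at A[largest]
    return t2, t3
```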
Evaluating these running time expressions analytically is too tedious, so I implemented the algorithms and kept a counter according to the costs described above.
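The counter itself amounts to an accumulator charged according to the cost table. A minimal sketch of the idea (the class and operation names are my own):

```python
class CostCounter:
    """Accumulate operation charges per the cost table."""
    PRICES = {"constant": 0, "variable": 1, "array": 2, "arith": 2,
              "assign": 1, "compare": 1, "and": 1,
              "if": 2, "for": 3, "while": 2, "call": 4}

    def __init__(self):
        self.total = 0

    def charge(self, *ops):
        for op in ops:
            self.total += self.PRICES[op]

# Executing `i = j - 1` charges an assignment, a variable lookup,
# and an arithmetic op: 1 + 1 + 2 = 4 (this is c_4 above).
c = CostCounter()
c.charge("assign", "variable", "arith")
```

Sprinkling such `charge` calls through each algorithm gives the totals reported below without evaluating the expressions by hand.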
QuickSort: Referring to the QUICKSORT pseudocode on page 154, we have the following costs and iterations.
Line | Cost | Iterations |
1 | 5 | 1 |
2 | T_{P}(n)+9 | t_{1} |
3 | T(t_{2})+7 | t_{1} |
4 | T(n-t_{2})+9 | t_{1} |
Summing Cost*Iterations yields the recurrence T(n) = T_{P}(n) + T(t_{2}) + T(n - t_{2}) + 30 when t_{1} = 1 (and T(n) = 5 when t_{1} = 0), where t_{2} is the number of elements in the left subarray after PARTITION.
Referring to the PARTITION pseudocode on page 154, we have the following costs and iterations.
Line | Cost | Iterations |
1 | 5 | 1 |
2 | 5 | 1 |
3 | 5 | 1 |
4 | 2 | t_{1} |
5 | 5 | t_{2} |
6 | 7 | t_{2} + 1 |
7 | 5 | t_{3} |
8 | 7 | t_{3} + 1 |
9 | 5 | t_{1} |
10 | 17 | t_{1} - 1 |
11 | 2 | 1 |
where t_{1} - 1 is the number of swaps that take place, and t_{2} + t_{3} = n+2. Thus, T_{P}(n) = 12n + 24 t_{1} + 36.
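The value of t_{1} can be observed directly with an instrumented Hoare-style partition (a 0-indexed sketch; the notes' PARTITION may differ in small details):

```python
def hoare_partition(A, p, r):
    """Partition A[p..r] around pivot A[p]; return (q, t1), where t1
    counts outer-loop iterations and t1 - 1 is the number of swaps."""
    x = A[p]                       # pivot
    i, j, t1 = p - 1, r + 1, 0
    while True:
        t1 += 1
        j -= 1
        while A[j] > x:            # scan down from the right
            j -= 1
        i += 1
        while A[i] < x:            # scan up from the left
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j, t1
```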
CountingSort: Referring to the pseudocode on page 176, we get the following costs and iterations.
Line | Cost | Iterations |
1 | 4 | k+1 |
2 | 4 | k |
3 | 4 | n+1 |
4 | 13 | n |
5 | 0 | 1 |
6 | 4 | k |
7 | 14 | k-1 |
8 | 0 | 1 |
9 | 4 | n+1 |
10 | 11 | n |
11 | 13 | n |
Summing the Cost*Iterations and adding k for the cost of allocating array C, we get T(n,k) = 45n + 27k - 2, where k is the range of the key values.
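The structure being charged looks like this (a 0-indexed sketch with keys in 0..k-1; the notes use 1..k):

```python
def counting_sort(A, k):
    """Stable counting sort for integer keys in 0..k-1."""
    C = [0] * k                # the auxiliary array charged k above
    for key in A:              # count occurrences (lines 3-4)
        C[key] += 1
    for v in range(1, k):      # prefix sums (lines 6-7)
        C[v] += C[v - 1]
    B = [None] * len(A)
    for key in reversed(A):    # stable placement (lines 9-11)
        C[key] -= 1
        B[C[key]] = key
    return B
```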
InsertionSort: t_{2} = 2, t_{3} = 1, t_{4} = 3, t_{5} = 1, t_{6} = 5, t_{7} = 2, t_{8} = 4, t_{9} = 2, t_{10} = 2. Thus, Σ_{j=2}^{10} t_{j} = 22 and Σ_{j=2}^{10} (t_{j} - 1) = 13. Plugging these values into the equation on page 8 yields T(n) = 551.
MergeSort: Expanding the recurrence for T(10) produces nine T_{M} terms, all of which are left unevaluated because each one may have a different t_{1} and t_{3}. For all of the T_{M}(i) terms, t_{1} = i-1, except for the 4th, 7th and 8th terms, where t_{1} = i-2. And t_{3} = 1 for all but the 1st and 9th terms. Plugging all this in (not forgetting to add an extra n=10 for allocation of B), we get T(n) = 1632.
HeapSort: T(n) = 2585.
QuickSort: T(n) = 1678.
CountingSort: For n=10 and k=23, T(10,23) = 1069.
So, INSERTIONSORT is the fastest.
InsertionSort: t_{2} = 1, t_{3} = 2, t_{4} = 1, t_{5} = 5, t_{6} = 3, t_{7} = 2, t_{8} = 5, t_{9} = 5, t_{10} = 3. Thus, Σ_{j=2}^{10} t_{j} = 27 and Σ_{j=2}^{10} (t_{j} - 1) = 18. Plugging these values into the equation on page 8 yields T(n) = 661.
MergeSort: Expanding the recurrence for T(10) produces nine T_{M} terms, all of which are left unevaluated because each one may have a different t_{1} and t_{3}. For all of the T_{M}(i) terms, t_{1} = i-1, except for the 7th term, where t_{1} = 1. And t_{3} = 1 for all but the 2nd, 8th and 9th terms. Plugging all this in (not forgetting to add an extra n=10 for allocation of B), we get T(n) = 1662.
HeapSort: T(n) = 2582.
QuickSort: T(n) = 1738.
CountingSort: For n=10 and k=8, T(10,8) = 664.
So, INSERTIONSORT and COUNTINGSORT are the fastest.
InsertionSort: This is the worst-case scenario for InsertionSort, so the equation on page 9 applies. Plugging in the above c_{i}'s yields T(n) = 11n^{2} + 15n - 25, so T(100) = 111475.
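The closed form can be checked numerically:

```python
# Worst-case InsertionSort cost at n = 100, per the formula above.
n = 100
T = 11 * n**2 + 15 * n - 25   # 11*10000 + 1500 - 25
# T == 111475
```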
MergeSort: Because the array is reverse sorted, MERGE will always be merging two arrays in which every element of the left array is greater than every element of the right array. Therefore, all of the elements in the right array will be copied within the while loop (t_{1} = ⌊n/2⌋), and the elements left over for the final copy loop are on the left side (t_{3} = 0). Thus, T_{M}(n) = 29n + 18⌊n/2⌋ + 24 for all calls to MERGE, and the recurrence can be evaluated directly. Executing it for n=100 (not forgetting to add an extra 100 for the B array allocation) yields T(100) = 32238.
HeapSort: T(n) = 44232.
QuickSort: T(n) = 71508.
CountingSort: For n=100 and k=100000, T(100,100000) = 2704498.
So, MERGESORT is the fastest with HEAPSORT running a close second.