I highlight the key points covered in the ANU COMP6466 Algorithm Course. This course aims to develop students’ proficiency in algorithm analysis and design. The first five weeks focused on algorithm analysis, sort and search algorithm, particularly deterministic and randomized algorithms, which I found very inspiring.
These notes serve as a quick refresher of the course content for me, so I’ve omitted many mathematical notations.
Introduce the Insertion Sort, and its pseudocode is as follows, and the right side is the times the code executed:
for j = 2 to A.length # (n-1)
key = A[j] # (n-1)
while i > 0 and A[i] > key: # SUM(t_j) j from 2 to n
A[i+1] = A[i] # SUM(t_j - 1) j from 2 to n
i-- # SUM(t_j - 1) j from 2 to n
A[i] = key # (n-1)
Time Complexity Analysis: \(c1(n-1) + c2(n-1) + c3 \sum_{j=2}^{n}{t_j}+c4 \sum_{j=2}^{n}{t_j-1}+c5(n-1)\) \(= c'n + c'' \sum_{j=2}^{n}{t_j}\) \(<= c'n + c'' \sum_{j=2}^{n}{j}\) \(= c'n + c'' (2+n) * (n-1) / 2\) \(= O(n^2)\)
Notations | Equality |
---|---|
f(n) = O(g(n)) | 0 <= f(n) <= c*g(n) |
f(n) = o(g(n)) | 0 <= f(n) < c*g(n) |
f(n) = W(g(n)) | 0 < c*g(n) <= f(n) |
f(n) = w(g(n)) | 0 < c*g(n) < f(n) |
f(n) = sigma(g(n)) | 0 < c1g(n) <= f(n)<= c2g(n) |
So for all the f(n), will it satisfy f(n)=O(g(n)) or \sigma(g(n)) or \Omiga(g(n))? No, e.g. f(n) = n, g(n) = n^(1+sin(n)).
1 < log(n) < n < nlog(n) = log(n!) < n^2 < n^3 < 2^n < n!
This part is one of the most important parts in this course. Three methods are introduced to help making recurrence Analysis:
The time cost of Recurrence Algorithms could always be written in this format: T(n) = a*T(n/b) + cf(n).
For the substitution method, we firstly make substitution for the T(n/k) and guess the law of T(n). Then we use Mathematical Induction to prove our guess.
For the Recursion Tree Method, draw the recursion tree, figure out the level, and the total time cost. Then guess what T(n) looks like as well. Finally, using mathematical induction to prove our guess.
For the Master Theorem, remember all the formulas. Master Theorem is taught in week 3.
Consider T(n)=a*T(n/b)+f(n), a>=1, b>=1.
When we have i-1 types of coupons, the probability that we get the i-th type coupon is:
\[P_i = 1-(i-1)/n = (n-i+1) / n\]Therefore, E[x_i] = 1/ p_i = n / (n-i+1)
\[\Sum_{i=1}^{n}{E[x_i]} = \Sum_{i=1}^{n}{(n)/(n-i+1)} = n * \Sum_{i=1}^{n}{1/(n-i+1) = n* \Sum_{j=0}^{n-i}{1/(j+1)}} = n*log(n) + c*n\]In the Insertion Sort, the random step happens as we do the inversion of A[i] and A[j]. Suppose X is the total number of the inversion times. x_{i, j} = 1 is the event that A[i] and A[j] need inversion.
P_{i, j} = 1/2
E[x] = E[\Sum_{i=1}^{n}\Sum_{j=i+1}^{n}{x_{i, j}}] = \Sum_{i=1}^{n}\Sum_{j=i+1}^{n}{P_{i, j}}
=\Sum_{i=1}^{n}\Sum_{j=i+1}^{n}{1/2} = n*(n-1)/4
The number of comparison between A[i] and A[j] is random. Let X be the number of comparison, x_{i, j} = 1 be the event that A[i] and A[j] compare. for a sequence from i to j. P_{i, j} = P(i as pivot) + P(j as pivot) = 1/ (j-i+1) + 1/(j-i+1) = 2/ (j-i+1)
Therefore, E[x] = E[\Sum_{i=1}^{n}\Sum_{j=i+1}^{n}{x_{i, j} = 1}] = E[\Sum_{i=1}^{n}\Sum_{j=i+1}^{n}{2/(j-i+1)}] = 2 * \Sum_{i=1}^{n}(log(n)+r) = 2nlog(n) + c
To verify the matrix multiplication A * B = C, we have basically deterministic algorithms, which takes O(n^3) and O(n^2.37) separately. However, with randomized algorithm, we reduce the complexity to O(n^2).
Steps are:
Then we can repeat the original algorithm k times. If all the outputs are Yes, then return Yes, otherwise return No. The probability it returns Yes while the answer is actually No is atmost 1 / 2^k.
The time complexity of this randomized algorithm is O(k*n^2) = O(n^2).
Two important concepts of sorting algorithm is in place and stable.
In place means the arrangement happens inside the array; does not need additional memory.
Stable means elements with the same values appear in the output array in the same order as they appear in the input array.
Sort Methods-> | Insertion Sort | Merge Sort | RandQuick Sort | Count Sort | Bucket Sort | Radix Sort |
---|---|---|---|---|---|---|
In place | Yes | No | Yes | No | No | No |
Stable | Yes | Yes | No | Yes | Yes | Yes |
Bucket Sort use Insertion Sort inside its lists, therefore it’s stable;
Radix Sort uses Count Sort when comparing the same digit, therefore not in place but stable.
Count Sort: O(N+M)
Bucket Sort: Average Complexity O(n)
Radix Sort: Average Complexity O(d*n)
O(nlog(n))