Example 9.3.4. Complexity of the LinearSearch Algorithm.
Consider the LinearSearch algorithms,
Algorithm 9.1.9 and
Algorithm 9.1.10, that find a target value in an array of
\(n\) integers. The algorithm looks at each integer in turn, comparing it to the target. Here, the size of the problem is the number of integers stored in the list. The basic operation is to compare an integer’s value to that of the target. As with
MaximumElement
above, it is reasonable to assume that it takes a fixed amount of time to do one such comparison, regardless of the value of the two integers or their positions in the array.
Let us call \(c_1\) the amount of time required to compare two integers, \(c_2\) the time required to increment variable index
because this must be done for each value in the list, \(c_3\) the time for the assignment to loc
when the target is found, and \(c_4\) the time taken to initialize index
and loc
. The total time to run LinearSearch
is therefore approximately \((c_1 + c_2)n + c_3 + c_4\text{,}\) because we must make \(n\) comparisons and increments, while the assignment to loc and the initializations happen only once. LinearSearch
has a running time expressed by the function
\begin{equation*}
T(n)=(c_1 + c_2)n + (c_3 + c_4) \text{.}
\end{equation*}
The Big O can then be calculated as follows:
\begin{equation*}
\begin{array}{c c}
(c_1 + c_2)n + (c_3 + c_4) & \leq (c_1 + c_2)n + (c_3 + c_4)n \\
& \leq (c_1 + c_2+ c_3 + c_4)n \\
& = O(n)
\end{array}
\end{equation*}
The asymptotic complexity of LinearSearch
is therefore the Big O of \(T(n)\text{.}\) Because \(T(n) = O(n)\text{,}\) this is a linear complexity algorithm.
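The pseudocode for Algorithm 9.1.9 and Algorithm 9.1.10 is not reproduced here, but the analysis above can be illustrated with a Python sketch (the function name and variable names are chosen to match the discussion; they are illustrative, not the textbook's exact pseudocode). The comments map each step to the constants \(c_1, \dots, c_4\text{:}\)

```python
def linear_search(values, target):
    """Return the index of target in values, or -1 if it is absent."""
    loc = -1                              # c4: initialize loc (and index, below)
    for index in range(len(values)):      # c2: increment index once per element
        if values[index] == target:      # c1: one comparison per element
            loc = index                   # c3: assignment when target is found
            break
    return loc

print(linear_search([4, 8, 15, 16, 23, 42], 16))  # prints 3
```

In the worst case (the target is absent or last), the loop body runs \(n\) times, giving the \((c_1 + c_2)n\) term; the initialization and the single assignment to loc contribute the constant \(c_3 + c_4\text{.}\)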
Example 9.3.5. Complexity of the BinarySearch Algorithm.
Consider
Algorithm 9.1.12, which finds a target value in a sorted list of
\(n\) integers. The algorithm calculates the middle position and compares the element at that position to the target. Here too, the size of the problem is the number of integers stored in the list. The basic operation is to compare an integer’s value to that of the target. Assume that it takes a fixed amount of time to do one such comparison.
Let us call \(c_1\) the amount of time required to compare two integers, \(c_2\) the time required to calculate the middle index, \(c_3\) the time to reassign either begin
or end
, and \(c_4\) the time taken to initialize loc, begin
and end
.
An important property to remember for this analysis is that if \(n\) is a power of 2, say \(2^m\text{,}\) then \(\log_{2}(n) = m\text{.}\)
Assume the length of the list is \(n = 2^m\text{.}\) At each iteration, the algorithm divides the list in half, so the length of the new list is \(\frac{2^m}{2} = 2^{m-1}\text{.}\) This continues until the remaining list has a single element, i.e., length \(1 = 2^0\text{.}\) Therefore, the maximum number of splits that can occur is \(m = \log_2(n)\text{.}\)
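This halving argument is easy to check directly. The short sketch below (an illustrative helper, not part of the textbook's algorithms) counts how many times a power of 2 can be halved before reaching 1, and confirms the count equals \(\log_2(n)\text{:}\)

```python
import math

def count_halvings(n):
    """Count how many times n can be halved (integer division) before reaching 1."""
    splits = 0
    while n > 1:
        n //= 2
        splits += 1
    return splits

# For n = 2**m the number of halvings is exactly m = log2(n).
for m in range(11):
    n = 2 ** m
    assert count_halvings(n) == m == int(math.log2(n))
```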
The total time to run BinarySearch
is therefore approximately \((c_1 + c_2 +c_3)\log_2(n) + c_4\text{.}\) BinarySearch
has a running time expressed by the function
\begin{equation*}
T(n)=(c_1 + c_2 +c_3)\log_2(n) + c_4 \text{.}
\end{equation*}
The Big O can then be calculated as follows:
\begin{equation*}
\begin{array}{c c}
(c_1 + c_2 +c_3)\log_2(n) + c_4 & \leq (c_1 + c_2 +c_3)\log_2(n) + c_4 \log_2(n) \\
& \leq (c_1 + c_2+ c_3 + c_4)\log_2(n) \\
& = O(\log_2(n))
\end{array}
\end{equation*}
The asymptotic complexity of
BinarySearch
is therefore
\(O(\log_2(n))\text{.}\) As can be seen in
Figure 8.6.7, this is a far faster algorithm than the
\(O(n)\) LinearSearch
.
The recursive version of BinarySearch
Algorithm 9.2.3 has the same complexity as the iterative version:
\(O(\log_2(n))\text{.}\) The only difference is that instead of splitting the list in half and looping, it splits the list in half and makes a recursive call to itself.
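The iterative version of the algorithm can be sketched in Python as follows (the variable names begin, end, and loc follow the discussion above; the iteration counter is added for illustration and is not part of Algorithm 9.1.12). The counter lets us confirm the \(\log_2(n)\) bound experimentally:

```python
import math

def binary_search(values, target):
    """Binary search on a sorted list; returns (index or -1, loop iterations)."""
    begin, end, loc = 0, len(values) - 1, -1   # c4: initialize loc, begin, end
    iterations = 0
    while begin <= end and loc == -1:
        iterations += 1
        middle = (begin + end) // 2            # c2: calculate the middle index
        if values[middle] == target:           # c1: compare to the target
            loc = middle
        elif values[middle] < target:
            begin = middle + 1                 # c3: discard the lower half
        else:
            end = middle - 1                   # c3: discard the upper half
    return loc, iterations

values = list(range(1024))                     # n = 2**10, so log2(n) = 10
loc, iters = binary_search(values, 1023)       # worst case: last element
assert loc == 1023 and iters <= math.log2(1024) + 1
```

Even in the worst case, the loop runs at most \(\log_2(n) + 1\) times, matching the analysis.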
Example 9.3.6. Complexity of the MergeSort Algorithm.
We now analyze the running time of algorithm
MergeSort
,
Algorithm 9.2.8. It follows from the pseudocode that, when running this algorithm together with its recursive calls, several calls are made to algorithm
Merge
,
Algorithm 9.2.9. What we need to count is the total number of comparisons that are made. That is, we will determine the total number of times that the line “if
\(x \leq y\)” in algorithm
Merge
is executed when running algorithm
MergeSort(L,n)
.
We first observe that the number of comparisons made by algorithm Merge(L1,L2)
is at most \(|L_1|+|L_2|\text{.}\)
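A Python sketch of the merge step makes this bound concrete (the comparison counter is added for illustration; the textbook's Algorithm 9.2.9 does not include it). Each execution of the comparison `L1[i] <= L2[j]` moves one element into the output, and once either input list is exhausted no further comparisons are made, so the count is at most \(|L_1|+|L_2|\text{:}\)

```python
def merge(L1, L2):
    """Merge two sorted lists; return (merged list, number of comparisons)."""
    result, comparisons = [], 0
    i = j = 0
    while i < len(L1) and j < len(L2):
        comparisons += 1                  # one execution of "if x <= y"
        if L1[i] <= L2[j]:
            result.append(L1[i]); i += 1
        else:
            result.append(L2[j]); j += 1
    result.extend(L1[i:])                 # at most one of these is non-empty,
    result.extend(L2[j:])                 # and copying makes no comparisons
    return result, comparisons

merged, comps = merge([1, 3, 5], [2, 4, 6])
assert merged == [1, 2, 3, 4, 5, 6] and comps <= 6
```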
Let \(n\) be an integer and assume for simplicity that \(n\) is a power of two, i.e., \(n=2^k\) for some integer \(k \geq 0\text{.}\) We define \(T(n)\) to be the maximum number of comparisons made when running algorithm MergeSort(L,n)
on any input list \(L\) of \(n\) numbers. Note that we include in \(T(n)\) all comparisons that are made during all calls to Merge
that are part of all recursive calls that are generated when running MergeSort(L,n)
.
Consider a list \(L\) of \(n\) numbers, where \(n\) is a power of two. For \(n=1\text{,}\) it follows from the pseudocode for MergeSort(L,n)
that \(T(1) = 0\text{.}\) Assume that \(n \geq 2\) and consider again the pseudocode for MergeSort(L,n)
. Which parts of the algorithm make comparisons between input elements?
The call MergeSort(L1,m)
is a recursive call on a list of \(m=n/2\) numbers. By definition, the total number of comparisons made in this call (together with all its recursive subcalls) is at most \(T(n/2)\text{.}\)
The call MergeSort(L2,n-m)
is a recursive call on a list of \(n-m=n/2\) numbers. By definition, the total number of comparisons made in this call (together with all its recursive subcalls) is at most \(T(n/2)\text{.}\)
Finally, algorithm MergeSort(L,n)
calls the non-recursive algorithm Merge(L1,L2)
. We have seen above that the number of comparisons made in this call is at most \(|L_1|+|L_2| = n\text{.}\)
By adding the number of comparisons, we get
\begin{equation*}
T(n) \leq T(n/2) + T(n/2) + n = 2 \cdot T(n/2) + n .
\end{equation*}
Thus, we obtain the following recurrence relation:
\begin{equation*}
\begin{array}{lcl}
T(1) & = & 0 , \\
T(n) & \leq & 2 \cdot T(n/2) + n , \mbox{ if } n \geq 2 .
\end{array}
\end{equation*}
Our goal was to determine \(T(n)\text{,}\) but at this moment, we only have a recurrence relation for this function. We will solve this recurrence relation using a technique called unfolding:
Recall that we assume that \(n=2^k\) for some integer \(k \geq 0\text{.}\) We furthermore assume that \(n\) is a large integer. We know from above that
\begin{equation*}
T(n) \leq 2 \cdot T(n/2) + n .
\end{equation*}
If we replace \(n\) by \(n/2\text{,}\) which is a valid thing to do, we get
\begin{equation*}
T(n/2) \leq 2 \cdot T(n/2^2) + n/2 .
\end{equation*}
By combining these two inequalities, we get
\begin{equation*}
\begin{array}{c c c}
T(n) & \leq & 2 \cdot T(n/2) + n \\
& \leq & 2 \left( 2 \cdot T(n/2^2) + n/2 \right) + n \\
& = & 2^2 \cdot T(n/2^2) + 2n .
\end{array}
\end{equation*}
Let us repeat this: replacing \(n\) by \(n/2^2\) in the recurrence equation gives
\begin{equation*}
T(n/2^2) \leq 2 \cdot T(n/2^3) + n/2^2 .
\end{equation*}
By substituting this into the inequality for \(T(n)\text{,}\) we get
\begin{equation*}
\begin{array}{c c c}
T(n) & \leq & 2^2 \cdot T(n/2^2) + 2n \\
& \leq & 2^2 \left( 2 \cdot T(n/2^3) + n/2^2 \right) + 2n \\
& = & 2^3 \cdot T(n/2^3) + 3n .
\end{array}
\end{equation*}
In the next step, we replace \(n\) by \(n/2^3\) in the recurrence equation, which gives
\begin{equation*}
T(n/2^3) \leq 2 \cdot T(n/2^4) + n/2^3 .
\end{equation*}
By substituting this into the inequality for \(T(n)\text{,}\) we get
\begin{equation*}
\begin{array}{c c c}
T(n) & \leq & 2^3 \cdot T(n/2^3) + 3n \\
& \leq & 2^3 \left( 2 \cdot T(n/2^4) + n/2^3 \right) + 3n \\
& = & 2^4 \cdot T(n/2^4) + 4n .
\end{array}
\end{equation*}
At this point the pattern is clear: after \(k\) unfolding steps, we get the inequality
\begin{equation*}
T(n) \leq 2^k \cdot T(n/2^k) + kn .
\end{equation*}
Since \(n=2^k\text{,}\) we have \(T(n/2^k) = T(1)\text{,}\) which is \(0\) from the base case of the recurrence relation. Also, \(n=2^k\) implies that \(k = \log n\text{.}\) We conclude that
\begin{equation*}
T(n) \leq n \cdot T(1) + n \log n = n \log n .
\end{equation*}
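Before verifying this by induction, we can sanity-check the solution numerically. The sketch below (an illustrative helper) evaluates the recurrence with equality, \(T(1)=0\) and \(T(n) = 2 \cdot T(n/2) + n\text{,}\) and confirms it produces exactly \(n \log_2 n\) for powers of two:

```python
import math

def T(n):
    """Evaluate the recurrence T(1) = 0, T(n) = 2*T(n/2) + n, for n a power of 2."""
    return 0 if n == 1 else 2 * T(n // 2) + n

# For n = 2**k the recurrence solves to exactly n * k = n * log2(n).
for k in range(12):
    n = 2 ** k
    assert T(n) == n * k == n * int(math.log2(n))
```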
We thus have solved the recurrence relation. In case you have doubts about the validity of the unfolding method, we verify by induction that indeed
\begin{equation*}
T(n) \leq n \log n \mbox{ for } n = 2^k .
\end{equation*}
The base case is when \(n=1\text{.}\) In this case, we have \(T(1)=0\) and \(1 \log 1 = 1 \cdot 0 = 0\text{.}\) Let \(n \geq 2\) be a power of \(2\) and assume that
\begin{equation*}
T(n/2) \leq (n/2) \log (n/2) .
\end{equation*}
From the recurrence relation, we get
\begin{equation*}
T(n) \leq 2 \cdot T(n/2) + n .
\end{equation*}
By substituting the induction hypothesis into this inequality, we get
\begin{equation*}
\begin{array}{c c c}
T(n) & \leq & 2 \cdot (n/2) \log (n/2) + n \\
& = & n \log (n/2) + n \\
& = & n \left( \log n - \log 2 \right) + n \\
& = & n \left( \log n - 1 \right) + n \\
& = & n \log n .
\end{array}
\end{equation*}
Thus, by induction, \(T(n) \leq n \log n\) for any integer \(n\) that is a power of \(2\text{.}\)
Until now, we have only counted the number of comparisons made by algorithm MergeSort
. It follows from the pseudocode that the total running time, i.e., the total number of “elementary” steps, is within a constant factor of the total number of comparisons. Therefore, if \(n\) is a power of \(2\text{,}\) the running time of algorithm MergeSort(L,n)
is \(O(n \log n)\text{.}\)
For general values of \(n\text{,}\) the recurrence relation for the number of comparisons becomes the following:
\begin{equation*}
\begin{array}{lcl}
T(n) & = & 0 , \mbox{ if } n = 0 \mbox{ or } n = 1 , \\
T(n) & \leq & T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil)
+ n , \mbox{ if } n \geq 2 .
\end{array}
\end{equation*}
It can be shown by induction that this recurrence relation solves to \(T(n) = O(n \log n)\text{.}\) We have proved the following result:
Theorem 9.3.7. Big O of MergeSort.
For any list \(L\) of \(n\) numbers, the running time of algorithm MergeSort(L,n)
is \(O(n \log n)\text{.}\)
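The whole analysis can be checked end to end with a Python sketch of MergeSort (the structure follows the description above; the shared comparison counter is added for illustration and is not part of Algorithm 9.2.8). The counter accumulates every execution of the "if \(x \leq y\)" comparison across all recursive calls, so for \(n\) a power of 2 it should never exceed \(n \log_2 n\text{:}\)

```python
import math
import random

def merge(L1, L2, counter):
    """Merge two sorted lists, counting each "if x <= y" comparison."""
    result, i, j = [], 0, 0
    while i < len(L1) and j < len(L2):
        counter[0] += 1                       # one comparison of input elements
        if L1[i] <= L2[j]:
            result.append(L1[i]); i += 1
        else:
            result.append(L2[j]); j += 1
    return result + L1[i:] + L2[j:]           # copying makes no comparisons

def merge_sort(L, counter):
    """Sort L by splitting in half, recursing, and merging the halves."""
    n = len(L)
    if n <= 1:                                # base case: T(0) = T(1) = 0
        return L
    m = n // 2
    return merge(merge_sort(L[:m], counter),  # recursive call on floor(n/2) items
                 merge_sort(L[m:], counter),  # recursive call on ceil(n/2) items
                 counter)

counter = [0]
data = random.sample(range(10000), 1024)      # n = 2**10
assert merge_sort(data, counter) == sorted(data)
assert counter[0] <= 1024 * math.log2(1024)   # T(n) <= n log n
```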