
Section 9.3 Algorithmic Complexity

The most common way of ranking different algorithms for the same problem is by how quickly they run. Ideally, we want the fastest possible algorithm for any particular problem. But how do we measure running time?

As a specific example, how long does it take to sing the song BottlesOfBeer(n) from Algorithm 9.1.1? This is obviously a function of the input value n, but it also depends on how quickly you can sing. Some singers might take ten seconds to sing a verse; others might take twenty. Technology widens the possibilities even further. Dictating the song over a telegraph using Morse code might take a full minute per verse. Downloading an mp3 over the Web might take a tenth of a second per verse. Duplicating the mp3 in a computer’s main memory might take only a few microseconds per verse.

What’s important here is how the singing time changes as n grows. Singing BottlesOfBeer(2n) requires about twice as much time as singing BottlesOfBeer(n), no matter what technology is being used.

We can measure time by counting how many times the algorithm executes a certain instruction or reaches a certain milestone in the “code”. For example, we might notice that the word “beer” is sung three times in every verse of BottlesOfBeer, so the number of times you sing “beer” is a good indication of the total singing time. For this question, we can give an exact answer: BottlesOfBeer(n) mentions “beer” exactly \(3n + 3\) times. This is reflected in the asymptotic singing time \(\Theta(n)\text{.}\)
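
This kind of milestone counting is easy to do mechanically. Below is a minimal Python sketch (the name beer_mentions and the simplified verse structure are assumptions for illustration; Algorithm 9.1.1 itself is not reproduced here) that tallies how often “beer” is sung:

    def beer_mentions(n):
        """Count the times 'beer' is sung by BottlesOfBeer(n).

        Assumes 'beer' occurs three times in each of the n verses, and that
        the closing stanza adds three more mentions, giving 3n + 3 in total.
        """
        count = 0
        for bottles in range(n, 0, -1):  # one verse per remaining bottle
            count += 3                   # 'beer' appears three times per verse
        count += 3                       # the closing 'no more bottles' lines
        return count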

Subsection 9.3.1 Basic Operations and Input Size

Of primary consideration when estimating an algorithm's performance is the number of basic operations required by the algorithm to process an input of a certain size. The terms "basic operations" and "size" are both rather vague and depend on the algorithm being analyzed. Size is often the number of inputs processed. For example, when singing BottlesOfBeer the size of the input is the number of bottles of beer you start with. When comparing sorting algorithms the size of the problem is typically measured by the number of records to be sorted. A basic operation must have the property that its time to complete does not depend on the particular values of its operands. Adding or comparing two integer variables are examples of basic operations. Summing the contents of an array containing \(n\) integers is not, because the cost depends on the value of \(n\) (i.e., the size of the input).

Because the most important factor affecting running time is normally size of the input, for a given input size \(n\) we often express the time \(T\) to run the algorithm as a function of \(n\text{,}\) written as \(T(n)\text{.}\) We will always assume \(T(n)\) is a non-negative value. We can then determine the order or asymptotic complexity of the algorithm by calculating the Big O of \(T(n)\text{.}\)

Consider Algorithm 9.1.11, which solves the problem of finding the largest value in an array of \(n\) integers. The algorithm looks at each integer in turn, saving the largest value seen so far. Here, the size of the problem is the number of integers stored in the list. The basic operation is to compare an integer's value to that of the largest value seen so far. It is reasonable to assume that it takes a fixed amount of time to do one such comparison, regardless of the value of the two integers or their positions in the array.
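
Algorithm 9.1.11 is not reproduced here, but the following Python sketch captures the same idea and makes the comparison we are counting explicit (the function name follows the surrounding discussion):

    def MaximumElement(values):
        """Return the largest value in a non-empty list of integers."""
        maximum = values[0]                  # initialize with the first element
        for index in range(1, len(values)):
            if values[index] > maximum:      # the basic operation: one comparison
                maximum = values[index]      # save the largest value seen so far
        return maximum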

Let us call \(c\) the amount of time required to compare two integers in function MaximumElement. We do not care right now what the precise value of \(c\) might be. Nor are we concerned with the time required to increment variable index because this must be done for each value in the list, or the time for the actual assignment when a larger value is found, or the little bit of extra time taken to initialize maximum. We just want a reasonable approximation for the time taken to execute the algorithm. The total time to run MaximumElement is therefore approximately \(cn\text{,}\) because we must make \(n\) comparisons, with each comparison costing \(c\) time. We say that MaximumElement (and by extension, the largest-value sequential search algorithm for any typical implementation) has a running time expressed by the function

\begin{equation*} T(n)=cn\text{.} \end{equation*}

The asymptotic complexity of MaximumElement is therefore the Big O of \(T(n)\text{.}\) This is a Linear Complexity algorithm because \(T(n) = O(n)\text{.}\)

The running time of a statement that assigns the first value of an integer array to a variable is simply the time required to copy that value. We can assume this assignment takes a constant amount of time regardless of the value. Let us call \(c_1\) the amount of time necessary to copy an integer. No matter how large the array on a typical computer (given reasonable conditions for memory and array size), the time to copy the value from the first position of the array is always \(c_1\text{.}\) Thus, the equation for this algorithm is simply

\begin{equation*} T(n) = c_1 \end{equation*}

indicating that the size of the input \(n\) has no effect on the running time. This is a Constant Complexity algorithm: \(T(n) = O(1)\text{,}\) so it has constant running time.
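
A sketch of such a constant-time operation (the name first_element is hypothetical):

    def first_element(values):
        """Copy the first value of an integer array; one assignment, so T(n) = c1."""
        x = values[0]    # constant-time copy, independent of len(values)
        return x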


Consider the following function written in Python:
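
(The original code listing is not reproduced here; the sketch below matches the analysis that follows, in which a counter sum is incremented inside two nested loops over \(n\text{.}\))

    def NestedLoops(n):
        """Increment sum once for every pair (i, j) with 0 <= i, j < n."""
        sum = 0                  # 'sum' shadows the built-in, but mirrors the text
        for i in range(n):
            for j in range(n):   # the inner loop runs n times for each value of i
                sum += 1         # the basic operation, executed n * n times
        return sum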

What is the running time for this function? We can assume that incrementing sum takes constant time; call this time \(c\text{.}\) (We can ignore the time required to initialize sum and to increment the loop counters i and j. In practice, these costs can safely be bundled into time \(c\text{.}\)) The total number of increment operations is \(n^2\text{:}\) the inner j loop runs \(n\) times for each of the \(n\) values of i, so we have \(n \cdot n = n^2\) operations in total. Thus the running time is \(T(n) = cn^2\text{,}\) and NestedLoops has quadratic complexity: \(T(n) = O(n^2)\text{.}\)

Subsection 9.3.2 Analyzing Algorithmic Complexity

Analyzing algorithmic complexity or running time can get tricky, but it always follows a few basic rules:

  1. Look for loops in the algorithm.

    • If there are no loops, operations run in constant time, \(O(1)\)
    • If there are loops that run a fixed number of times, that is constant time, \(O(1)\)
    • If there is one loop using a counter that runs from 1 to \(n\text{,}\) the loop requires linear time, \(O(n)\text{.}\)
    • If there is one loop where \(n\) is divided by a constant, that requires logarithmic time, \(O(\log n)\text{.}\)
    • If there are nested loops, where each one has a counter over \(n\text{,}\) the running time is \(O(n^m)\text{,}\) where \(m\) is the number of nested loops.
  2. Look for recursive calls in the algorithm.

    • These work similarly to loops: if 1 is subtracted from \(n\) in each recursive call, it is \(O(n)\text{;}\) if \(n\) is divided by a constant, it is \(O(\log n)\text{;}\) and so on.
  3. Nested structures multiply their Big Os to give the overall Big O of the algorithm.

  4. For structures that happen one after another (sequentially), the algorithm's Big O is the largest of the individual Big Os (see the sketch below).
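
As a quick illustration of rules 1 and 4, here is a short Python sketch (the function name is hypothetical): a loop that repeatedly halves \(n\) is \(O(\log n)\text{,}\) a loop that counts from 1 to \(n\) is \(O(n)\text{,}\) and because the two loops run one after another the overall complexity is the larger of the two, \(O(n)\text{.}\)

    def sequential_example(n):
        """One O(log n) loop followed by one O(n) loop: overall O(n)."""
        # Rule 1: n is divided by a constant each pass -> O(log n)
        count = 0
        m = n
        while m > 1:
            m //= 2
            count += 1

        # Rule 1: a counter running from 1 to n -> O(n)
        total = 0
        for i in range(1, n + 1):
            total += i

        # Rule 4: the two pieces run one after another, so the overall
        # running time is the larger of O(log n) and O(n), i.e. O(n).
        return count, total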

Below we will analyze the complexity of several algorithms that were introduced in Section 9.1 and Section 9.2.

We now analyze the running time of algorithm MergeSort, Algorithm 9.2.8. It follows from the pseudocode that, when running this algorithm together with its recursive calls, several calls are made to algorithm Merge, Algorithm 9.2.9. What we need to count is the total number of comparisons that are made. That is, we will determine the total number of times that the line “if \(x \leq y\)” in algorithm Merge is executed when running algorithm MergeSort(L,n).

We first observe that the number of comparisons made by algorithm Merge(L1,L2) is at most \(|L_1|+|L_2|\text{.}\)
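
The reason is that each execution of the comparison moves one element from \(L_1\) or \(L_2\) to the output list, and there are only \(|L_1|+|L_2|\) elements to move. The following Python sketch, written in the spirit of Algorithm 9.2.9 (it is not the book's pseudocode), counts the comparisons:

    def merge_count(L1, L2):
        """Merge two sorted lists and count executions of the comparison 'if x <= y'."""
        result, comparisons = [], 0
        i, j = 0, 0
        while i < len(L1) and j < len(L2):
            comparisons += 1                 # one execution of 'if x <= y'
            if L1[i] <= L2[j]:
                result.append(L1[i])         # the comparison consumes one element
                i += 1
            else:
                result.append(L2[j])
                j += 1
        result.extend(L1[i:])                # leftovers need no comparisons
        result.extend(L2[j:])
        return result, comparisons           # comparisons <= len(L1) + len(L2)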

Let \(n\) be an integer and assume for simplicity that \(n\) is a power of two, i.e., \(n=2^k\) for some integer \(k \geq 0\text{.}\) We define \(T(n)\) to be the maximum number of comparisons made when running algorithm MergeSort(L,n) on any input list \(L\) of \(n\) numbers. Note that we include in \(T(n)\) all comparisons that are made during all calls to Merge that are part of all recursive calls that are generated when running MergeSort(L,n).

Consider a list \(L\) of \(n\) numbers, where \(n\) is a power of two. For \(n=1\text{,}\) it follows from the pseudocode for MergeSort(L,n) that \(T(1) = 0\text{.}\) Assume that \(n \geq 2\) and consider again the pseudocode for MergeSort(L,n). Which parts of the algorithm make comparisons between input elements?

  • The call MergeSort(L1,m) is a recursive call on a list of \(m=n/2\) numbers. By definition, the total number of comparisons made in this call (together with all its recursive subcalls) is at most \(T(n/2)\text{.}\)

  • The call MergeSort(L2,n-m) is a recursive call on a list of \(n-m=n/2\) numbers. By definition, the total number of comparisons made in this call (together with all its recursive subcalls) is at most \(T(n/2)\text{.}\)

  • Finally, algorithm MergeSort(L,n) calls the non-recursive algorithm Merge(L1,L2). We have seen above that the number of comparisons made in this call is at most \(|L_1|+|L_2| = n\text{.}\)

By adding the number of comparisons, we get

\begin{equation*} T(n) \leq T(n/2) + T(n/2) + n = 2 \cdot T(n/2) + n . \end{equation*}

Thus, we obtain the following recurrence relation:

\begin{equation*} \begin{array}{rcl} T(1) & = & 0 , \\ T(n) & \leq & 2 \cdot T(n/2) + n , \quad \mbox{ if } n \geq 2 . \end{array} \end{equation*}

Our goal was to determine \(T(n)\text{,}\) but at this moment, we only have a recurrence relation for this function. We will solve this recurrence relation using a technique called unfolding:

Recall that we assume that \(n=2^k\) for some integer \(k \geq 0\text{.}\) We furthermore assume that \(n\) is a large integer. We know from above that

\begin{equation*} T(n) \leq 2 \cdot T(n/2) + n . \end{equation*}

If we replace \(n\) by \(n/2\text{,}\) which is a valid thing to do, we get

\begin{equation*} T(n/2) \leq 2 \cdot T(n/2^2) + n/2 . \end{equation*}

By combining these two inequalities, we get

\begin{equation*} \begin{array}{c c c} T(n) & \leq & 2 \cdot T(n/2) + n \\ & \leq & 2 \left( 2 \cdot T(n/2^2) + n/2 \right) + n \\ & = & 2^2 \cdot T(n/2^2) + 2n . \end{array} \end{equation*}

Let us repeat this: Replacing \(n\) by \(n/2^2\) the recurrence equation gives

\begin{equation*} T(n/2^2) \leq 2 \cdot T(n/2^3) + n/2^2 . \end{equation*}

By substituting this into the inequality for \(T(n)\text{,}\) we get

\begin{equation*} \begin{array}{c c c} T(n) & \leq & 2^2 \cdot T(n/2^2) + 2n \\ & \leq & 2^2 \left( 2 \cdot T(n/2^3) + n/2^2 \right) + 2n \\ & = & 2^3 \cdot T(n/2^3) + 3n . \end{array} \end{equation*}

In the next step, we replace \(n\) by \(n/2^3\) in the recurrence equation, which gives

\begin{equation*} T(n/2^3) \leq 2 \cdot T(n/2^4) + n/2^3 . \end{equation*}

By substituting this into the inequality for \(T(n)\text{,}\) we get

\begin{equation*} \begin{array}{c c c} T(n) & \leq & 2^3 \cdot T(n/2^3) + 3n \\ & \leq & 2^3 \left( 2 \cdot T(n/2^4) + n/2^3 \right) + 3n \\ & = & 2^4 \cdot T(n/2^4) + 4n . \end{array} \end{equation*}

At this point the pattern is clear; continuing the unfolding until we reach the base case, we get the inequality

\begin{equation*} T(n) \leq 2^k \cdot T(n/2^k) + kn . \end{equation*}

Since \(n=2^k\text{,}\) we have \(T(n/2^k) = T(1)\text{,}\) which is \(0\) from the base case of the recurrence relation. Also, \(n=2^k\) implies that \(k = \log n\text{.}\) We conclude that

\begin{equation*} T(n) \leq n \cdot T(1) + n \log n = n \log n . \end{equation*}

We thus have solved the recurrence relation. In case you have doubts about the validity of the unfolding method, we verify by induction that indeed

\begin{equation*} T(n) \leq n \log n \quad \mbox{ for } n = 2^k . \end{equation*}

The base case is when \(n=1\text{.}\) In this case, we have \(T(1)=0\) and \(1 \log 1 = 1 \cdot 0 = 0\text{.}\) Let \(n \geq 2\) be a power of \(2\) and assume that

\begin{equation*} T(n/2) \leq (n/2) \log (n/2) . \end{equation*}

From the recurrence relation, we get

\begin{equation*} T(n) \leq 2 \cdot T(n/2) + n . \end{equation*}

By substituting the induction hypothesis into this inequality, we get

\begin{equation*} \begin{array}{c c c} T(n) & \leq & 2 \cdot (n/2) \log (n/2) + n \\ & = & n \log (n/2) + n \\ & = & n \left( \log n - \log 2 \right) + n \\ & = & n \left( \log n - 1 \right) + n \\ & = & n \log n . \end{array} \end{equation*}

Thus, by induction, \(T(n) \leq n \log n\) for any integer \(n\) that is a power of \(2\text{.}\)
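
As an additional sanity check, the recurrence can be evaluated directly. The sketch below assumes the recurrence holds with equality, which gives the largest values it allows:

    import math

    def T(n):
        """Evaluate T(1) = 0, T(n) = 2*T(n//2) + n for n a power of two."""
        if n == 1:
            return 0
        return 2 * T(n // 2) + n

    # For powers of two, T(n) never exceeds n * log2(n).
    for k in range(11):
        n = 2 ** k
        assert T(n) <= n * math.log2(n) + 1e-9   # e.g. T(8) = 24 = 8 * log2(8)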

Until now, we have only counted the number of comparisons made by algorithm MergeSort. It follows from the pseudocode that the total running time, i.e., the total number of “elementary” steps, is within a constant factor of the total number of comparisons. Therefore, if \(n\) is a power of \(2\text{,}\) the running time of algorithm MergeSort(L,n) is \(O(n \log n)\text{.}\)

For general values of \(n\text{,}\) the recurrence relation for the number of comparisons becomes the following:

\begin{equation*} \begin{array}{lcl} T(n) & = & 0 , \mbox{ if } n = 0 \mbox{ or } n = 1 , \\ T(n) & \leq & T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + n , \mbox{ if } n \geq 2 . \end{array} \end{equation*}

It can be shown by induction that this recurrence relation solves to \(T(n) = O(n \log n)\text{.}\) We have thus proved the following result: for every input list of \(n\) numbers, algorithm MergeSort(L,n) sorts the list in \(O(n \log n)\) time.

Exercises 9.3.3 Exercises for Section 9.3

1.

Show that BubbleSort, Algorithm 9.1.13, has \(O(n^2)\) complexity.

Answer

The outer \(i\) loop runs \(n\) times.

The number of inner \(j\) iterations varies with \(i\text{:}\) for each value of \(i\text{,}\) the inner loop makes \(n - (i + 1)\) iterations. If \(i = 0\) it makes \(n-1\) iterations, if \(i = 1\) it makes \(n-2\) iterations, ..., down to \(n-n = 0\) iterations. Together, the two loops therefore execute the loop body a number of times equal to the sum of the first \(n-1\) integers, which is \(\frac{(n-1)n}{2}\text{.}\)

Inside the \(j\) loop, there is a comparison and the “swap” does three assignments, making 4 basic operations.

So we have \(4 \cdot \frac{(n-1)n}{2}= \frac{4n^2 - 4n}{2} = 2n^2 - 2n\text{,}\) which is a polynomial with \(n^2\) as the highest-order term. Therefore, BubbleSort is \(O(n^2)\text{.}\)
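
A Python sketch in the spirit of Algorithm 9.1.13 (not its exact pseudocode) that counts the comparisons confirms the count above:

    def bubble_sort_count(L):
        """Bubble sort; the inner loop makes n - (i + 1) comparisons for each i."""
        n = len(L)
        comparisons = 0
        for i in range(n):                           # outer loop runs n times
            for j in range(n - i - 1):               # inner loop runs n - (i + 1) times
                comparisons += 1
                if L[j] > L[j + 1]:
                    L[j], L[j + 1] = L[j + 1], L[j]  # the swap from the argument above
        return comparisons                           # equals (n-1)n / 2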

2.

Show that the greedy algorithm for making change, Algorithm 9.1.15, has \(O(x)\) complexity in terms of the number of comparisons made, where \(x\) is the amount of change to be given.

3.

This is the pseudocode for Selection Sort
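
(The pseudocode listing is not reproduced here; the Python sketch below is consistent with the trace given in the answer, which uses the 1-based positions of the pseudocode.)

    def selection_sort(L):
        """Selection sort; Python indices are 0-based, the answer's trace is 1-based."""
        n = len(L)
        for j in range(n - 1):            # pseudocode: for j = 1 to n - 1
            minLoc = j                    # position of the smallest value seen so far
            for i in range(j + 1, n):     # pseudocode: for i = j + 1 to n
                if L[i] < L[minLoc]:
                    minLoc = i
            if minLoc != j:
                L[j], L[minLoc] = L[minLoc], L[j]   # swap only when needed
        return L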

  1. Trace the Selection Sort on the list: 6, 3, 9, 2, 8. Show all steps, all value changes of \(i\text{,}\) \(j\text{,}\) and \(minLoc\text{,}\) and the list at the end of each \(j\) loop.

  2. Determine the worst-case time complexity of Selection Sort. Explain your answer, don't just write what the big-O is.

Answer
    • \(n = 5\)
    • \(j = 1\)
    • \(minLoc = 1\)
    • \(i = 2\)
    • Is 3 less than 6? Yes: \(minLoc = 2\)
    • \(i = 3\)
    • Is 9 less than 3? No
    • \(i = 4\)
    • Is 2 less than 3? Yes: \(minLoc = 4\)
    • \(i = 5 \)
    • Is 8 less than 2? No
    • Done with i loop
    • \(minLoc \neq j\)
    • Swap 6 and 2
    • New list: 2, 3, 9, 6, 8
    • \(j = 2\)
    • \(minLoc = 2\)
    • \(i = 3\)
    • Is 9 less than 3? No
    • \(i = 4\)
    • Is 6 less than 3? No
    • \(i = 5\)
    • Is 8 less than 3? No
    • Done with i loop
    • \(minLoc = j\)
    • No swap
    • New list: 2, 3, 9, 6, 8
    • \(j = 3\)
    • \(minLoc = 3\)
    • \(i = 4\)
    • Is 6 less than 9? Yes: \(minLoc = 4\)
    • \(i = 5\)
    • Is 8 less than 6? No
    • Done with i loop
    • \(minLoc \neq j\)
    • Swap 9 and 6
    • New list: 2, 3, 6, 9, 8
    • \(j = 4\)
    • \(minLoc = 4\)
    • \(i = 5\)
    • Is 8 less than 9? Yes: \(minLoc = 5\)
    • Done with i loop
    • \(minLoc \neq j\)
    • Swap 9 and 8
    • New list: 2, 3, 6, 8, 9
    • \(j = 5\) done.
  2. Selection Sort is \(O(n^2)\) because the \(j\) loop runs \(n-1\) times, and the \(i\) loop inside it runs at most \(n-1\) times in the worst case.

    \begin{equation*} (n-1)(n-1) = n^2 - 2n + 1 = O(n^2) \end{equation*}
4.

Determine the worst-case time complexity of Insertion Sort, Algorithm 9.1.14. Explain your answer, don't just write what the big-O is.

5.

Determine the worst-case time complexity of Recursive Linear Search, Algorithm 9.2.3. Explain your answer, don't just write what the big-O is.

Answer

Each call to the recursive function RecursiveLinearSearch runs in constant time; there are no loops, only two comparisons. We are only concerned with the worst case for Big O, which is when the target is not in the list. The location variable controls the number of recursive calls that happen. It starts at 1 and increases by 1 in every recursive call, stopping when the base case \(loc = n + 1\) is reached. That means the number of recursive calls that are made is \(n + 1 - 1 = n\text{.}\) The total number of operations is then \(2n\text{.}\) Therefore the algorithm is \(O(n)\text{.}\)
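
A Python sketch consistent with this analysis (Algorithm 9.2.3 is not reproduced here; the names and the return convention below are assumptions):

    def RecursiveLinearSearch(L, target, loc=1):
        """Search L for target using 1-based positions; return the position, or 0 if absent."""
        n = len(L)
        if loc == n + 1:              # first comparison: base case, target not in the list
            return 0
        if L[loc - 1] == target:      # second comparison: check the current element
            return loc
        return RecursiveLinearSearch(L, target, loc + 1)   # at most n recursive calls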

6.

Determine the worst-case time complexity of Recursive Binary Search, Algorithm 9.2.5. Explain your answer, don't just write what the big-O is.