DefinitionTime Complexity

The time complexity describes how the running time of the algorithm grows as the size of the input increases. It can be expressed using different notations given varied purposes, which provides an upper bound, lower bound or average performance of the algorithm.

Mathematically, time complexity can be defined as a function T:A×nF(n)T: A \times n \to F(n), where AA is the set of all algorithms1, nn is the input size, and F(n)F(n) is the set of possible functions with nn as input, normally, F(n)={1,nk,kn,logn(x),ln(x),nlogn(x),n!}F(n) = \{1, n^k, k^n, \log_n (x), \ln(x), n \log_n (x), n!\}, where kNk \in \mathbb{N}, nn is the input size of the algorithm. We have time complexity T(a)=f(n)T(a) = f(n) for aAa \in A and f(n)F(n)f(n) \in F(n) when the input size of aa is nn.

This could be a bit abstract, but this is not really the common time complexity definition we use in practice. In real practice, we add another layer of abstraction to the time complexity, benchmarking algorithms in different aspects, such as the worst-case, best-case, and average-case time complexity. We will discuss them in the following sections.

Let’s consider the example of linear search, a simple algorithm that searches for a target element in an array. The pseudocode of the linear search algorithm is shown below. We simply designate a target element and traverse the array from the beginning to the end, comparing each element with the target. If the target is found, we return the index of the element; otherwise, we return -1 to indicate that the target is not in the array. Now we will analyze the time complexity of the linear search algorithm.

Algorithm 1 Linear Search

Require: array AA, target xx

Ensure: index of xx or 1-1

1:for i0i \gets 0 to n1n - 1 do

2:if A[i]=xA[i] = x then

3:return ii // target found at index ii

4:end if

5:end for

6:return 1-1 // target not found

ExampleLinear Search Input Size

To figure out the time complexity of the linear search, we first need to determine which metadata decides the input size of the algorithm. Obviously, in this algorithm we only traverse the array linearly, which means the input size of the algorithm is the length of the array. Therefore, we denote the input size as nn, and the time complexity of the linear search algorithm is nn, where nn is the length of the array.

Now we look at a slightly different example, which is also related to Linear Search.

def linear_search(arr: list[int], target: int) -> int:
"""
Linear search algorithm to find the target element in the array.
"""
for i in range(len(arr)):
if arr[i] == target:
return i # Target found at index i
return -1 # Target not found in the array
target = 5
target_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(linear_search(target_list, target)) # Output: 4
ProblemLinear Search on a Fixed Input

What is the time complexity of the linear search algorithm in this call?

The answer is constant function, f(n)=1f(n)=1, but not linear function, f(n)=nf(n)=n. The reason is that the length of the array is fixed here, instead of being a variable. Therefore, the time complexity of the linear search algorithm in this call is constant. The same goes for a given array of length 100,000, because the length of the array is fixed. This example may confuse some people, but no worries, we will explicitly discuss the worst-case, best-case, and average-case time complexity in the following sections using linear search as an example.

Worst-case Time Complexity (Big-O Notation)

As we said, when we evaluate the time complexity of an algorithm, we usually take certain considerations, and the most common one is how bad the algorithm can perform. This is called the worst-case time complexity, which provides an upper bound on the running time of the algorithm. We use big-O notation to represent the worst-case time complexity, for example, O(n)O(n), O(n2)O(n^2), O(logn)O(\log n), etc.

DefinitionBig-O Notation

By assuming the input of an algorithm is of variable size nn, we use big-O notation to analyse the asymptotic upper bound of the time complexity when nn \to \infty. We state that, in the worst case, the time complexity T(n)T(n) of an algorithm with input size nn has T(n)=O(g(n))T(n) = O(g(n)), if there exist constants c>0c > 0 and rNr \in \mathbb{N} such that for all n>rn > r, 0f(n)cg(n)0 \le f(n) \le c \cdot g(n). where f(n)f(n) is the exact time complexity of the algorithm, and g(n)g(n) is the upper bound of the time complexity.

In natural language, we say that the time complexity of the algorithm is O(g(n))O(g(n)) if there exists a constant cc and a positive integer rr such that, for all input sizes nn greater than rr, the time complexity of the algorithm is bounded above by cg(n)c \cdot g(n).

Still, we just use linear search as an example, though it’s quite special, because its actual worst-case time complexity is exactly the same as the big-O notation, while this does not always happen in other algorithms. We will see more examples in the future.

ExampleLinear Search Worst-case Complexity

For the linear search algorithm, the worst-case time complexity is O(n)O(n), where nn is the length of the array. This is because, we have f(n)=nf(n) = n, and g(n)=ng(n) = n, when c=1c=1, nN\forall n \in \mathbb{N}, 0f(n)1g(n)0 \le f(n) \le 1 \cdot g(n).

In this case, we have more than what we need to know about the worst-case time complexity, because we need only n>r\forall n > r, 0f(n)cg(n)0 \le f(n) \le c \cdot g(n), however we have nN\forall n \in \mathbb{N}, 0f(n)cg(n)0 \le f(n) \le c \cdot g(n). This is decided by the nature of the linear search algorithm, which is quite simple and straightforward. However, again, this is not always the case, and we will see more examples in the future.

Best-case Time Complexity (Big-Omega Notation)

Opposite to the worst-case time complexity, we have the best-case time complexity, which provides a lower bound on the running time of the algorithm. We use big-Omega notation to represent the best-case time complexity, for example, Ω(n)\Omega(n), Ω(n2)\Omega(n^2), Ω(logn)\Omega(\log n), etc.

DefinitionBig-Omega Notation

By assuming the input of an algorithm is of variable size nn, we use big-Omega notation to analyse the asymptotic lower bound of the time complexity when nn \to \infty. We state that, in the best case, the time complexity T(n)T(n) of an algorithm with input size nn has T(n)=Ω(g(n))T(n) = \Omega(g(n)), if there exist constants c>0c > 0 and rNr \in \mathbb{N} such that for all n>rn > r, f(n)cg(n)f(n) \ge c \cdot g(n). where f(n)f(n) is the exact time complexity of the algorithm, and g(n)g(n) is the lower bound of the time complexity.

In natural language, we say that the time complexity of the algorithm is Ω(g(n))\Omega(g(n)) if there exists a constant cc and a positive integer rr such that, for all input sizes nn greater than rr, the time complexity of the algorithm is bounded below by cg(n)c \cdot g(n).

Again, we use linear search as an example to illustrate the best-case time complexity.

ExampleLinear Search Best-case Complexity

For the linear search algorithm, the best-case time complexity is Ω(1)\Omega(1), which means that the best-case time complexity is constant. This is because, we have f(n)=1f(n) = 1, and g(n)=1g(n) = 1, when c=1c=1, nN\forall n \in \mathbb{N}, f(n)1g(n)f(n) \ge 1 \cdot g(n).

Average-case Time Complexity (Big-Theta Notation)

DefinitionAverage Time Complexity

Let I(n)I(n) be the set of all input instances of size nn, and let T(n,x)T(n, x) denote the running time of an algorithm on input xx in I(n)I(n). Suppose there is a probability distribution P(x)P(x) over these inputs for example, the uniform distribution, meaning that for all xI(n)x \in I(n) we have P(x)=1/(I(n)).P(x)=1/(|I(n)|). Then the average running time is defined as Tavg(n)=xI(n)P(x)T(n,x).T_{\text{avg}} (n)= \sum_{x \in I(n)} P(x) \cdot T(n,x). In the case where every input is equally likely, this expression simplifies to Tavg(n)=1/(I(n))xI(n)T(n,x).T_{\text{avg}} (n)= 1/(|I(n)|) \sum_{x \in I(n)}T(n,x).

We say that an algorithm has an average-case time complexity of Θ(g(n))\Theta(g(n)) if there exist positive constants c1c_1 and c2c_2 and a function f(n)f(n) such that, for all sufficiently large nn, the average running time of the algorithm on inputs of size nn is bounded above and below by c1g(n)c_1 \cdot g(n) and c2g(n)c_2 \cdot g(n), respectively. In other words, the average-case time complexity is Θ(g(n))\Theta(g(n)) if there exist constants c1c_1, c2c_2, and n0n_0 such that, for all nn0n \ge n_0, we have c1g(n)Tavg(n)c2g(n)c_1 \cdot g(n) \le T_{\text{avg}} (n) \le c_2 \cdot g(n).

Average-case analysis is a more realistic measure of an algorithm’s performance, yet it is also more complex to determine than the worst-case or best-case time complexity. It requires knowledge of the probability distribution of inputs and the expected running time of the algorithm on each input.

To make it easier to understand, we can say that average time complexity, in probability theory, is the expected value of the time complexity of the algorithm, given a probability distribution of inputs, which is usually uniform by default.

Now we will discuss the average-case time complexity of the linear search algorithm, as an example for average-case analysis.

ExampleLinear Search Average-case Complexity

For the linear search algorithm, assuming the target element is equally likely to be at any position in the array or not present at all, the average-case time complexity is Θ(n)\Theta(n). If the target is present, the average number of comparisons is (1+2+...+n)/n=(n+1)/2(1+2+...+n)/n = (n+1)/2. If the target is not present, it takes nn comparisons. Assuming each of these n+1n+1 scenarios (target at index 0 to n-1, or not present) is equally likely, the average number of comparisons is [n(n+1)/2+n2]/[n(n+1)][n(n+1)/2 + n^2] / [n(n+1)], which simplifies to a value that is proportional to nn. Thus, Tavg(n)Θ(n)T_{\text{avg}}(n) \in \Theta(n).

Space Complexity

Space complexity is a measure of the amount of working storage an algorithm needs. That is, how much memory, in the worst case, is required at any point in the algorithm. As with time complexity, we are typically concerned with how the space requirements grow as the input size grows, and we use asymptotic notation to describe this growth.

Let S(n)S(n) be the space complexity of an algorithm, where nn is the input size. This includes space for input variables, auxiliary variables, and output variables.

  • Auxiliary Space: This refers to the temporary space used by the algorithm, not including the space taken by the input. For many algorithms, the auxiliary space complexity is more interesting than the total space complexity, especially if the input is given and cannot be modified.
ExampleLinear Search Space Complexity

The linear search algorithm, as presented, has a space complexity of S(n)Θ(1)S(n) \in \Theta(1). This is because it only uses a fixed number of variables (e.g., i, target, and the input array itself, which is considered given). The space required does not grow with the size of the input array nn.

ExampleRecursive Algorithms and Space Complexity

For recursive algorithms, space complexity must account for the space used by the call stack. Consider a recursive implementation of calculating the nn-th Fibonacci number:

def fib(n):
if n <= 1:
return n
else:
return fib(n-1) + fib(n-2)

While its time complexity is high, its space complexity is determined by the maximum depth of the recursion tree, which is nn. Therefore, S(n)Θ(n)S(n) \in \Theta(n).

Algorithm Complexity Rigorously Explained2

For readers with a background in mathematical analysis and set theory, we can formalize the concepts of algorithm complexity further.

Formalizing Time and Space

Let A\mathcal{A} be a fixed algorithm. Let I\mathcal{I} be the set of all possible inputs to A\mathcal{A}. For each input xIx \in \mathcal{I}, let x|x| denote its size.

  • A cost function C:IR+C: \mathcal{I} \to \mathbb{R}^+ can represent either time (number of operations) or space (number of memory cells used).

  • The complexity function for a specific case (worst, best, average) is a function F:NR+F: \mathbb{N} \to \mathbb{R}^+.

    • Worst-case complexity: Fworst(n)=supxIx=nC(x)F_{\text{worst}}(n) = \sup_{\substack{x \in \mathcal{I} \\ |x| = n}} C(x)
    • Best-case complexity: Fbest(n)=infxIx=nC(x)F_{\text{best}}(n) = \inf_{\substack{x \in \mathcal{I} \\ |x| = n}} C(x)
    • Average-case complexity: Requires a probability distribution μn\mu_n on the set of inputs {xI:x=n}\{x \in \mathcal{I} : |x| = n\}. Then, Favg(n)=EXμn[C(X)]={x:x=n}C(x)dμn(x)F_{\text{avg}}(n) = \mathbb{E}_{X \sim \mu_n}[C(X)] = \int_{\{x: |x|=n\}} C(x) d\mu_n(x). For a discrete uniform distribution over mnm_n possible inputs of size nn, this becomes Favg(n)=1mnxIx=nC(x)F_{\text{avg}}(n) = \frac{1}{m_n} \sum_{\substack{x \in \mathcal{I} \\ |x| = n}} C(x).

Asymptotic Notation as Set Relations

The notations OO, Ω\Omega, and Θ\Theta describe relationships between sets of functions. Let G\mathcal{G} be the set of all functions g:NR+g: \mathbb{N} \to \mathbb{R}^+.

  • O(g(n))={f(n)c>0,n0N s.t. nn0,0f(n)cg(n)}O(g(n)) = \{ f(n) \mid \exists c>0, n_0 \in \mathbb{N} \text{ s.t. } \forall n \ge n_0, 0 \le f(n) \le c \cdot g(n) \}
  • Ω(g(n))={f(n)c>0,n0N s.t. nn0,0cg(n)f(n)}\Omega(g(n)) = \{ f(n) \mid \exists c>0, n_0 \in \mathbb{N} \text{ s.t. } \forall n \ge n_0, 0 \le c \cdot g(n) \le f(n) \}
  • Θ(g(n))=O(g(n))Ω(g(n))\Theta(g(n)) = O(g(n)) \cap \Omega(g(n))

When we state F(n)O(g(n))F(n) \in O(g(n)), we mean that the function FF (which could be FworstF_{\text{worst}}, FbestF_{\text{best}}, etc.) is an element of the set O(g(n))O(g(n)).

The Role of the Computational Model

The exact cost function C(x)C(x) depends on the underlying computational model (e.g., Random Access Machine (RAM), Turing Machine). For instance:

  • On a RAM, arithmetic operations might take O(1)O(1) time.
  • On a multi-tape Turing Machine, they might take O(logn)O(\log n) time if numbers can be arbitrarily large.

However, for most algorithmic analysis in computer science, the RAM model is implicitly assumed, and the focus is on the number of operations relative to the input size, abstracting away constant factors and lower-order terms. This is why asymptotic analysis is so powerful—it provides a machine-independent way to classify algorithms by their fundamental growth rates.

Footnotes

  1. This is not rigorous if we are using an axiomatic set theory system, while we only use naive set theory to explain the notion.

  2. This section is optional, recommended for those familiar with functions, sets, and mathematical analysis.