An introduction
It is almost useless to write a program if it doesn’t give you the result in time (Almost 🤔?? Let’s face it... sometimes there is no choice but to use a bad algorithm until we get a good one 🤓). Hence, algorithm performance is hugely important in programming. We have to continuously analyze, adapt and optimize our programs, even for the worst inputs.
The Big O notation is a very handy tool to analyze algorithms and see how they perform in the worst case. It’s a pretty basic concept that helps us compare the growth rates of algorithms and choose the right one for the job. In other words, it helps us understand how an algorithm performs for large inputs.
It can be really confusing in the beginning. But I believe answering the following questions is a nice start:
- What does Big O notation for time complexity really mean? And why do we need it?
- Why is O(n²) worse than O(n)? And is that always the case?
- How can we accept/reject an algorithm given the O-notation and the size of the input?
Here, we will try and answer these questions in a simple and practical way.
The Big O notation for time complexity gives us an idea of how our algorithm’s running time will behave for large inputs (Why large inputs? Why not small inputs? We will find the answer later in this post). This information is essential to accept or reject an algorithm for a given size of input.
Getting started
The definition
The formal definition of Big O is,
For a given function g(n), we denote by O(g(n)) (pronounced “big-oh of g of n”) the set of functions:
O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c · g(n) for all n ≥ n0 }
Note that the definition says nothing about algorithms or programs. That’s because it is not limited to algorithms: it describes the growth rate of any function. But here we will focus only on functions that describe algorithm execution time.
Okay, but what does it mean? 🤔
f(n) = O(g(n))
In one sentence: past some point n0 in the input size, f(n) never grows faster than g(n) (up to a constant factor c).
Let’s focus on what that means for algorithms.
The purpose of Big O notation is to study the growth of functions.
Suppose we have a function f(n) that gives the execution time of a given algorithm, where n is the size of the input. When we write f(n) = O(g(n)), what we essentially say is that after n = n0, f(n) will not grow faster than g(n).
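For a quick, concrete check of the definition (the numbers here are hypothetical, chosen only to exercise it): suppose f(n) = 3n + 5 and g(n) = n. Choosing c = 4 and n0 = 5 gives
0 ≤ 3n + 5 ≤ 4n for all n ≥ 5
so f(n) = O(n) by the definition above.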
So if you know g(n), you know the maximum rate at which your algorithm’s execution time will grow for any increase in input size past n0. This upper bound will help you to
- decide if the algorithm can run in a reasonable time given the size of the input.
- compare two algorithms for the same job; normally the one with the best Big O wins.
Assume you found out that g(n) = n! for your algorithm, and the input can be as large as 1 million items, say an array of numbers. You don’t need to run your algorithm against an array of size 1 million to say that this will not work. Think about it: with this growth rate, your algorithm may run on the order of 1000000! operations for 1M items. That is an astronomically large number, and your algorithm may never return.
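To get a feel for how explosive n! is, here is a small Go sketch (my own illustration, using only the standard math/big package) that prints how many decimal digits n! has for a few values of n:

package main

import (
	"fmt"
	"math/big"
)

func main() {
	for _, n := range []int64{10, 100, 1000, 10000} {
		// MulRange(1, n) computes 1 * 2 * ... * n, i.e. n!
		f := new(big.Int).MulRange(1, n)
		fmt.Printf("%6d! has %d decimal digits\n", n, len(f.Text(10)))
	}
}

Even 10000! already has tens of thousands of digits; an algorithm doing n! operations has no chance for n = 1M.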
Let’s understand the concept of growth rate upper bound with an example
Let’s say you created an algorithm that takes an array of size n as input and does some operations on it.
Suppose the function took an execution time of 3ⁿ to complete the job (or, in other words, the algorithm performs 3ⁿ basic operations like add, subtract, assignment etc.).
Then, f(n) = 3ⁿ
Now, consider another function g(n) = n! (n factorial)
For small values of n, 3ⁿ is bigger than n!.
But n! catches up fast, and at n = 7 it overtakes 3ⁿ. See the table below.
| n | f(n) = 3ⁿ | g(n) = n! |
| --- | --- | --- |
| 1 | 3 | 1 |
| 2 | 9 | 2 |
| 3 | 27 | 6 |
| 4 | 81 | 24 |
| 5 | 243 | 120 |
| 6 | 729 | 720 |
| 7 | 2187 | 5040 |
| 8 | 6561 | 40320 |
| 9 | 19683 | 362880 |
Notice that from n = 7 onwards, n! overtakes 3ⁿ by an ever-growing margin. This is what we mentioned earlier: g(n) has a bigger growth rate than f(n). Or, in Big O notation,
3ⁿ = O(n!)
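By the way, you can reproduce the table above with a few lines of Go (a quick sketch; both 3ⁿ and n! still fit in an int64 for n ≤ 9):

package main

import "fmt"

func main() {
	pow3, fact := int64(1), int64(1)
	fmt.Println(" n       3^n        n!")
	for n := int64(1); n <= 9; n++ {
		pow3 *= 3 // 3^n
		fact *= n // n!
		fmt.Printf("%2d  %8d  %8d\n", n, pow3, fact)
	}
}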
Now suppose you created another algorithm for the same purpose whose execution time is n!.
You can now, without a doubt, say that your first algorithm works better for larger inputs.
This is not the only valid Big O notation for 3ⁿ:
3ⁿ = O(3ⁿ)
3ⁿ = O(3 · n!)
3ⁿ = O(n · n!)
All of these hold true as per the definition of Big O (it is just a matter of finding suitable values for n0 and c).
But we are interested in the g(n) with the lowest growth rate, because it is the most useful for analysis. Think about it! It is more accurate to think of 3ⁿ as O(3ⁿ) than as O(n!) when comparing algorithms for better runtime.
But what about an f(n) that has more than one term? There are two more things to do.
1. Ignore the lower-order terms, since their contribution to the growth of the function becomes negligible for large input sizes.
Consider f(n) = n² + n + 5.
When n = 100, f(n) = 100² + 100 + 5 = 10000 + 105.
We can ignore that 105 because its contribution to the growth is negligible.
Thus it is okay to say that O(n² + n + 5) = O(n²), since we are focusing on the growth rate rather than the actual value. Thus,
f(n) = O(n²)
Similarly,
if f(n) = n! + 3ⁿ, then f(n) = O(n!). Why? n! contributes the most to the growth, and 3ⁿ becomes negligible after, say, n = 7.
2. Don’t carry around constant multiplicands, because we don’t need to worry about them.
The constant c in the definition of Big O notation is there precisely to let us stop worrying about small multiplicative constants. Our focus is the growth rate.
f(n) = 2n² = O(n²) = O(2n²) = O(3n²) = O(4n²)
All of the equalities above hold; for instance, 2n² = O(n²) with c = 2 and n0 = 1, since 2n² ≤ 2 · n² for all n ≥ 1. So we do not bother with multiplicative constants in Big O notation and simply write
f(n) = 2n² = O(n²)
And that’s it!! You now have the basics to write the Big O notation.
Applying it in your algorithm
We are going to practice finding the execution time of algorithms and doing Big O analysis with the three examples below. Keep in mind: when we say that the execution time function f(n) = n², it means that when an input of size n is given, the algorithm performs n² basic operations like addition, subtraction, multiplication, assignment etc. Similarly, f(n) = 100n means that for an input of size n, the function performs 100n basic operations, and so on.
The execution time of those basic operations might differ between computers with different processing power. But we don’t care about that. An algorithm must be evaluated by how many operations it asks the computer to do, not by how fast the computer can run instructions.
Here are the three examples, all written in Golang. I have tried to write the code so that you can follow the logic even if you don’t know the language.
1. Add two numbers
func add(num1 int, num2 int) int {
sum := num1 + num2
return sum
}
Let’s break down the lines and find the execution time.
sum := num1 + num2
Three things are happening here:
- A variable named sum is declared (A Golang thing).
- Addition of num1 and num2.
- Assignment of that result to the variable sum.
All of these are basic operations.
So the number of instructions that the algorithm asks the computer to execute is the same irrespective of the size of the input. In other words, the execution time of this line of code is a constant. Let’s call it C_sum (again, we are not interested in its absolute value on any particular machine).
return sum
Again, this too is a basic operation and issues a fixed number of instructions to the CPU. Hence, this too has a constant execution time. Let’s call it C_return.
Now, obviously, the total execution time of the function is C_sum + C_return. Let’s call the total execution time C_total. Thus,
C_total = C_sum + C_return
Which, again, is a constant. In other words,
no matter how big the inputs (num1 and num2) are, the function will run in constant time.
f(n) = C_total = O(1)
Why? Think about it: g(n) = 1 satisfies the Big O definition with n0 = 1 and c = C_total. It is also the function with the smallest growth rate.
Here, f(n) is said to have constant time complexity, denoted O(1).
2. Sum of array elements
func elemSum(array []int) int {
	n := len(array) // Constant time, assuming len(array) is a constant-time function
	sum := 0        // Constant time
	i := 0          // Constant time
	for i < n {
		sum = sum + array[i] // Constant time
		i = i + 1            // Constant time
	}
	return sum // Constant time
}
By now, you are already familiar with basic operations and their constant execution time, so I am not going to talk about the lines marked with the comment “Constant time”.
Here, the point of interest is the for loop, which iterates n times. Remember, we defined the execution time as the number of basic instructions executed by the computer.
Let’s calculate the execution time for this algorithm.
The variables
C_init = sum of the execution times of the statements above the for loop. It is obviously a constant.
C_loop = sum of the execution times of the statements inside the for loop, for a single iteration. It too is a constant.
C_return = execution time of the return statement. It is a constant.
Since the loop executes n times, the constant-time statements inside it also get executed n times.
Thus,
f(n) = C_init + n · C_loop + C_return
= n · C_loop + 1 · C_init + 1 · C_return
= O(n)
Note how we dropped the lower-order terms and the constant multiplicands, as discussed earlier, to get the Big O notation.
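If you want to see this O(n) behavior empirically, here is a rough sketch (absolute timings will differ from machine to machine, which is exactly why we prefer Big O, but doubling n should roughly double the elapsed time):

package main

import (
	"fmt"
	"time"
)

func elemSum(array []int) int {
	sum := 0
	for i := 0; i < len(array); i++ {
		sum = sum + array[i] // One constant-time body, executed n times
	}
	return sum
}

func main() {
	for _, n := range []int{1000000, 2000000, 4000000} {
		array := make([]int, n)
		start := time.Now()
		elemSum(array)
		fmt.Printf("n = %8d took %v\n", n, time.Since(start))
	}
}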
3. Sum of array elements: the crazy way (an unnecessary loop to show its impact on the execution time and complexity)
func elemSum(array []int) int {
	n := len(array) // Constant time, assuming len(array) is a constant-time function
	sum := 0        // Constant time
	i := 0          // Constant time
	for i < n {
		j := 0 // Constant time
		for j < n {
			if i == j { // Constant time to check equality
				sum = sum + array[i] // Constant time
			}
			j = j + 1 // Constant time (outside the if, so the inner loop always advances)
		}
		i = i + 1 // Constant time
	}
	return sum // Constant time
}
Again, you already know the constant-time operations, so let’s move on to the two loops and start the calculations.
The variables
C_init = sum of the execution times of the lines above the outer for loop
C_jinit = execution time of the statement j := 0
C_if = execution time of the if condition check inside the inner for loop
C_i==j = execution time of the statement inside the if block. Note that it gets executed only once per n iterations of the inner loop, when i becomes equal to j. So this term will be multiplied by 1/n while we calculate the execution time
C_jinc = execution time of the increment operation of j
C_iinc = execution time of the increment operation of i
C_return = execution time of the return statement
Thus,
Note: Do not get intimidated by this big equation. Compare it with the program we wrote above and it will make sense.
f(n) = C_init + n · (C_jinit + n · (C_if + C_jinc + (1/n) · C_i==j) + C_iinc) + C_return
= C_init + n · C_jinit + n² · (C_if + C_jinc) + n · C_i==j + n · C_iinc + C_return
= n² · (C_if + C_jinc) + n · (C_jinit + C_i==j + C_iinc) + 1 · (C_init + C_return)
= O(n²)
Again, note how we dropped the lower-order terms and the constant multiplicands, as discussed earlier, to get the Big O notation.
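Another way to convince yourself of the n vs n² difference, without worrying about machine speed at all, is to count loop iterations instead of timing them. A small sketch (the counters are mine, added purely for illustration):

package main

import "fmt"

// countSimple counts the loop iterations of the single-loop version: n.
func countSimple(n int) int {
	ops := 0
	for i := 0; i < n; i++ {
		ops++
	}
	return ops
}

// countCrazy counts the loop iterations of the nested-loop version: n * n.
func countCrazy(n int) int {
	ops := 0
	for i := 0; i < n; i++ {
		for j := 0; j < n; j++ {
			ops++
		}
	}
	return ops
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("n = %4d  simple: %7d ops  crazy: %7d ops\n",
			n, countSimple(n), countCrazy(n))
	}
}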
Congratulations on making it this far!! Now you are ready for the answers to the questions we asked ourselves in the introduction.
The questions from the introduction
What does Big O notation for time complexity really mean? And why do we need it?
It is a simple and intuitive representation of the maximum growth rate of an algorithm’s running time, used to simplify the analysis and see how the algorithm will behave for large inputs.
To make it clearer: if you have a function like the one given below,
func outerFn() {
	innerFn() // innerFn runs with execution time n^3 + 6n^2 + 5
	// Some code here that runs with execution time n^4 + 7n^2
}
While analyzing the execution time of outerFn(), it is more convenient to work in Big O notation than with the actual time values.
Think about it
Absolute execution time values:
f(n) = n⁴ + 7n² + n³ + 6n² + 5
Big O notation (notice how the Big O notation of innerFn() has been used instead of its actual execution time):
f(n) = n⁴ + 7n² + O(n³)
= O(n⁴)
The Big O notation clearly tells you to watch out for the inputs for which n⁴ becomes too much. You can be less concerned about the n³ or n² terms from the first equation, since their contribution to the execution time will be negligible next to n⁴. In other words, n⁴ is the real problem, and Big O tells you to focus on that.
Also, think about a function that invokes many other functions like innerFn(). The analysis based on actual execution times becomes complex when we consider the execution times of all of those functions. It is more convenient to use the Big O notation of those inner invocations, like we did here.
For example, write f(n) as
f(n) = O(n) + O(n · log n) + O(n²)
= O(n²)
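As a hypothetical illustration (this function is my own, not one of the earlier examples), here is what such a composition could look like in Go:

package main

import (
	"fmt"
	"sort"
)

// analyze runs three phases with different complexities on the same input.
func analyze(array []int) int {
	sort.Ints(array) // O(n * log n)

	sum := 0
	for _, v := range array { // O(n)
		sum += v
	}

	pairs := 0
	for i := 0; i < len(array); i++ { // O(n^2): nested loops
		for j := i + 1; j < len(array); j++ {
			if array[i]+array[j] == sum {
				pairs++
			}
		}
	}

	// Total: O(n) + O(n * log n) + O(n^2) = O(n^2)
	return pairs
}

func main() {
	fmt.Println(analyze([]int{3, 1, 4, 1, 5}))
}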
Why is O(n2) worse than O(n)? And is it the case always?
As we already discussed, Big O notation is interested in the performance of the algorithm for large values of n. There, n² is always bigger than n.
However, an absolute execution time of n² might not be worse than n for small values (e.g., n = 0, n = 1).
We have already seen this behavior with 3ⁿ and n!:
O(n!) is worse than O(3ⁿ), but the absolute time n! is better than the absolute time 3ⁿ for n = 1, 2, 3, 4, 5, 6. See the 3ⁿ vs n! table above.
How can we accept/reject an algorithm given the O-notation and the size of the input?
If you have a choice of algorithm and may need to process a large amount of data, definitely go for the algorithm with the smallest worst-case time complexity (in other words, the best Big O). In most cases, you can choose that algorithm even when you are not sure about the size of the input: for big inputs it is definitely better, and for small inputs you may not notice the difference in execution times. Like in the case of O(3ⁿ) and O(n!), going for O(3ⁿ) won’t hurt you much even for small values of n.
Given below is a rough view of acceptable worst-case complexities for different input sizes, from HackerRank:
| Length of input | Worst accepted algorithm |
| --- | --- |
| ≤ [10..11] | O(n!), O(n⁶) |
| ≤ [15..18] | O(2ⁿ · n²) |
| ≤ [18..22] | O(2ⁿ · n) |
| ≤ 100 | O(n⁴) |
| ≤ 400 | O(n³) |
| ≤ 2K | O(n² · log n) |
| ≤ 10K | O(n²) |
| ≤ 1M | O(n · log n) |
| ≤ 100M | O(n), O(log n), O(1) |
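As a very rough sanity check before writing any code, you can combine such a table with the classic ballpark that a modern machine performs on the order of 10⁸ basic operations per second (treat that constant as an assumption, not a guarantee):

package main

import (
	"fmt"
	"math"
)

// opsPerSecond is a rough, assumed throughput for basic operations.
const opsPerSecond = 1e8

// feasible reports whether ops operations fit in the given time budget.
func feasible(ops, budgetSeconds float64) bool {
	return ops/opsPerSecond <= budgetSeconds
}

func main() {
	n := 1e6 // an input of one million items
	fmt.Println("O(n log n):", feasible(n*math.Log2(n), 1)) // ~2*10^7 ops -> true
	fmt.Println("O(n^2):    ", feasible(n*n, 1))            // 10^12 ops -> false
}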
And as always, there is no “one size fits all” solution for algorithms. Choose the solution that is best for your situation. Keep Big O as a handy tool.