In this blog, we’re going to look at fancy cool analysis techniques for data structures. And these are useful for tons of different data structures, especially in the context when you’re using a data structure to implement an algorithm.
For example, when you learn Dijkstra’s algorithm, you have the option of using different heap structures for the priority queue in Dijkstra, and each of them gives a different running time with Dijkstra. But we care more about the total running time of the algorithm, than the individual running time of each operation.
Which is precisely what amortization is about. It’s an analysis technique in computer science where we don’t worry about every single operation’s worst-case cost, and just worry about the total operation, i.e., the sum of all the operations’ cost. But there’s a lot of different ways to do it and we will see four different methods for doing it, and three-ish examples of doing it today.
This is a general technique called the aggregate method, which is by far the simplest and most intuitive method for doing amortization, but probably the weakest one because it may not be able to analyze more complicated algorithms. So in the aggregate method, we do some sequence of operations. Let’s assume, in general, there are k operations. So we formulate the amortized cost per operation as follows:
Let’s talk about the table doubling as an example of the Aggregate Method. We have seen some of this essentially in the context of hash tables. If we want to store n elements in a table of size m, and for this, suppose we use chaining or hashing with chaining, then we get an expected cost.
Though then we are unhappy about our space, we can consider handling a larger table size. One solution is to double m, which is the table size. So, whenever n (number of items) becomes larger than m (table size), we double the size of the table. Now the table’s size doubles to 2m, and we will have to allocate a new array of size 2m and copy over all the items, which involves hashing and will cost O(m). This would happen during one insertion operation, but overall it’s not going to be costly, because you only double log(n) times. But if you look at the total cost of n insertions?
It is at most 2^0 + 2^1 + 2^2 + ··· + 2^[lg n] = Θ(n).
So to do n insertions cost Θ(n) and we can say each insertion has Θ(n)/n = Θ(1) amortized cost, which implies that the amortized cost per operation is constant.
Amortized Bound Definition
Let’s take a look at the general definition of amortized bounds, which is important since you’re dealing with different types of operations.
Amortized cost can be, but does not necessarily have to be, the average cost. We can assign an amortized cost to each operation, such that they “preserve the total cost”
the amortize should always be bigger because we always want an upper bound on our actual cost. So proving that the amortized costs are constant per operation, we get that the sum of the actual cost is constant per operation. But we don’t learn anything about the individual costs, and learn only about the total cost.
Moving on to the next method, the accounting method. So with the accounting method, we define a bank account and an operation that can store credit in that bank account. It’s like the bank teller’s analysis, where the balance must always be non-negative.
With this method, an operation is allowed to store credit into a bank for future use, if its
An operation is also allowed to pay for its extra actual cost using existing credit if its
When we decide to store credit in the bank account, we’ve to pay for it. It’s like consuming time now in order to pay for it in the future. And for the operations costing money, so whenever we perform some operation, say deletion, we spend actual time, say x time. Now if we have x amount in the bank, we can pull those out of the bank, and use that amount to pay for the work, and then the deletion operation itself becomes free in an amortized sense.
And when we do an insertion operation, we physically take some coins out ourselves. That will cost something in the sense that the amortized cost of insertion goes up in order to put those coins in the bank, but then we’ll be able to use them for deletion. So this is what insertion is going to do.
All in all, we can store credit in the bank, and on the other hand, we’ll allow an operation to take coins out of the bank, and in this way, we can pay for time using the credit that’s been stored in the bank. This will be good as long as the bank balance remains non-negative at all times.
In the charging method, operations are allowed to charge costs retroactively to past operations. There’s no such bank balance anymore, although it’s essentially there. We allow the operations to charge some of their cost retroactively to the past, and not the future.
So, we’re not allowed to time travel to the future, and only allowed to go to the past, and charge for the operation. But we’ve got to be a little conservative. We cannot just keep charging the same operation a billion times, because then the cost of that operation is increasing. In the end, every operation had to have paid for its total charge.
Now, when we charge to the past, we get a free amount in the present, but we have to pay for whatever the future is going to do. So we will have to pay for that now on a consistent timeline. We will have to pay for things that come in the future.
This method is even more interesting and in some sense more powerful. It’s the last method on the list, the Potential method. It’s more like the counting strategy. We can think about there being a bank account with some balance, but we’re going to have to define that balance as a function of the data structure state.
We can think of it as kinetic potential. Whenever we have to do an expensive operation, we want this potential to grow large enough so that we can charge that cost to a decrease in the potential. To be precise, this is like storing up energy, and whenever we have some free time, we’ll give some of that time to the potential function.
When we started, we were super excited, because we went from log to log(log). But as we kept on exploring the different methods, we realized we were going from log to constant. It’s a little bit better, but again, these are all small numbers. Nevertheless, we would still like to go fast, as fast as possible.
In real life, many Online algorithms, which can process their input piece-by-piece in a serial fashion, commonly use amortized analysis.
07 — Aniket Ghorpade
14 — Tanishka Gupta
20 — Hasibullah Nuristani
59 — Anirudh Kolwadkar
70 — Jatan Loya