## Asymptotic speedup and data structure interpolation

Suppose you are tasked with designing a data structure consisting of values that can be compared with one another, and that this data structure must support two operations: insertion and lookup. At any moment, you may either be given a value and asked to insert it into your structure, or given a value and asked to say whether it is already present in your structure. The "computational complexity" of these two operations will be evaluated purely in terms of how many comparison operations they carry out on the provided data, so you don't need to worry about memory complexity, time complexity, or anything like that. Think of this problem as being analogous to a balance puzzle.

In this post I'd like to illustrate that an asymptotic speedup phenomenon can occur when a maximum acceptable complexity is imposed for the insertion operation, and the lookup operation is sought to be minimized. To give a specific example, suppose it's been decided that the lookup operation must be $\mathcal O(\log \log n)$ in the number of elements $n$. Then, contrary to intuition, there is no "best" lookup complexity satisfying this constraint. That is, for any possible data structure with an $\mathcal O(\log\log n)$ insertion complexity, there exists a different data structure also having $\mathcal O(\log\log n)$ insertion complexity, but with a strictly asymptotically lower lookup complexity.

### Interpolating between data structures

When designing data structures supporting insertion and lookup operations on ordered data, there's a tradeoff between the complexity of insertion and the complexity of lookup. At one end of the spectrum is an unordered bag, in which insertion requires no comparisons at all, but lookup requires $\Theta(n)$ comparisons since there is no choice but to compare the given element to each element in the bag one at a time. This kind of data structure is as lazy as possible, frontloading none of the work. At the other end of the spectrum is a totally ordered list, in which insertion requires $\Theta(\log_2 n)$ comparisons in the worst case, and lookup also requires $\Theta(\log_2 n)$ comparisons in the worst case, both of which can be implemented using binary search.

On the one hand, insertion cannot get any cheaper than it is for an unordered bag. On the other hand, lookup cannot get any cheaper than it is for a totally ordered list. But can we find data structures that "interpolate" between these two extremes, offering a finer-grained tradeoff between insertion and lookup complexities? Loosely speaking, we might picture some sort of "curve" of different attainable insertion-lookup tradeoffs:

Yes, we can! A trick for accomplishing this is to consider a data structure that is in some sense a "mixture" of an unordered bag and an ordered list. As we introduce new elements into our data structure, we could divide the elements into several ordered lists, with no ordering being maintained between the different lists. We might parametrize this strategy by letting $\ell : \mathbb N\to\mathbb N$ be a monotone increasing function such that $\ell(n+1) - \ell(n) \in \{0,1\}$ for all $n\in\mathbb N$, so that $\ell(n)$ represents the number of lists in the data structure when $n$ elements are present. The constant function $\ell(n) = 1$ would represent the extreme case in which we maintain an ordered list at any given time, and the function $\ell(n) = n$ would represent the other extreme in which each element belongs to its own singleton list, mimicking an unordered bag.

To be more specific, the algorithm for inserting a new element $v$ would be precisely as follows:

1. If $\ell(n+1)$ is greater than the number of ordered lists currently present, create a new singleton list containing $v$.
2. Otherwise, if not all of the lists have the same length, insert $v$ in one of the shortest-length ordered lists.
3. Otherwise, insert $v$ into an arbitrary ordered list.

To lookup an element $v$, we would simply perform binary search on each ordered list, one at a time, until either finding the element or exhausting all of the ordered lists. You might view this approach as employing sequential search across lists and binary search within lists.

Let's calculate the respective worst-case complexities of insertion and lookup in terms of the function $\ell$, which serves as a parameter for our data structure. The above algorithm keeps the lists "balanced" in such a way that the length of the longest list in the data structure is asymptotically bounded by at any given time. (Strictly speaking, this isn't true for any choice of $\ell(n)$. A sufficient condition, however, is for $n/\ell(n)$ to be asymptotic to some monotone increasing sequence.) To insert an element into a list with this many elements, we only need comparisons if we employ binary search. On the other hand, suppose we are looking up an element. We will have no choice but to take a wild guess about which of the $\ell(n)$ lists to search for our element first, and if we get really unlucky, we will have to search all of them. This means that the worst-case complexity of lookup is Hence, we can always achieve the following tradeoff between insertion and lookup complexities:

Now, by choosing a particular value for the parameter sequence $\ell(n)$, we can procure a data structure with $\mathcal O(\log\log n)$ insertion time. In particular, we need to choose $\ell(n)$ in such a way that There are actually many possible choices of $\ell(n)$ that will make this work. If $\ell(n)$ satisfies the following asymptotic identity for some power $p > 0$, then the above will be satisfied: With this growth order for the sequence $\ell(n)$, the complexity of lookup will then be given as follows, in terms of our formula: There's something peculiar about this: by choosing larger and larger values for the power $p$, we obtain a family of strategies with the same worst-case asymptotic insertion complexity of $\mathcal O(\log\log n)$, but more and more efficient lookup complexities. There's no way of choosing $\ell(n)$ here that minimizes the lookup complexity while keeping the insertion complexity $\mathcal O(\log\log n)$, since the above choice of $\ell(n)$ will always do better for a sufficiently large value of $p$. This suggests that our intuitive "tradeoff curve" is qualitatively unrealistic in an important way, and a visualization that is more descriptive of our parametrized data structure might look more like this:

This is the first hint that we have an asymptotic speedup situation on our hands. However, it does not prove that there is no asymptotically optimal solution to our problem in general, since we have only described a particular family of data structures, and we have not ruled out the possibility of some other kind of data structure with $\mathcal O(\log\log n)$ insertion complexity and optimal lookup complexity.

### Posets as states of knowledge

If we want to argue that there is no asymptotically optimal strategy for the problem in question, we will have to show how, for any given strategy, a strictly better one can be found. This is difficult because it's hard to say exactly what a "strategy" is. But we've already simplified things for ourselves a little bit by specifying that "complexity" is measured only in terms of the number of comparison operations carried out. This means that we can make this into less of a problem about data storage, and more of a problem about knowledge or information.

Suppose that you've received some number $n$ of elements, and that you've carried out some comparisons between them already, but perhaps not enough comparisons to know exactly how they are ordered. You might know how some pairs of elements compare to each other, and not know how other pairs of elements compare to each other. Your imperfect "state of knowledge" about the relative ordering of these $n$ elements can be expressed as a partially ordered set (poset) $\mathbb P$ with $n$ elements. Under this interpretation, the different possible total orderings of these $n$ elements that would be consistent with your current knowledge correspond to the different linear extensions of the poset $\mathbb P$. So, the amount of information that you lack about these elements can be quantified in terms of the number of linear extensions of $\mathbb P$. At worst, this is $n!$, which is the number of linear extensions of a poset of size $n$ where no two elements are comparable.

Now suppose that you are given an additional value $v$, and asked to look it up in your data structure. In the case where this element is not present, you won't be able to answer the question negatively with certainty unless you know that $v$ is distinct from each individual value in your data structure. This means that for each element $x$ in your data structure, you will have to know either $x > v$ or $x < v$, since this is the only kind of information that can lead you to conclude that $x\neq v$. Hence, confirming that $v$ does not belong to your data structure is as good as knowing precisely which elements are above and below it.

This observation motivates the following definition. If $\mathbb P' \supset\mathbb P$ is a poset extending $\mathbb P$ with one more element than $\mathbb P$, such that this new element is comparable to each element of $\mathbb P$, we shall say that $\mathbb P'$ constitutes a cut of the poset $\mathbb P$. The observations we've just made suggest that the complexity of deciding that $v$ does not belong to our partially ordered data structure is related to the number of different cuts of $\mathbb P$. If the poset $\mathbb P$ representing our current knowledge has more possible cuts, then there are more possible ways that $\mathbb P$ could be divided into the elements above $v$ and the elements below $v$, hence we will have to ask more questions in order to find out which split is really the case.

To gain some intuition, we might consider the extreme cases. If $\mathbb P$ has no comparable pairs of elements at all, then there are $2^n$ different cuts, since each element could be either above or below $v$, independently of the others. If $\mathbb P$ is linearly ordered, then there are only $n+1$ cuts. Here's how I would picture the $8$ different cuts on an unordered set of $3$ elements:

And here's how I would picture the $4$ different cuts on a linearly ordered set of $3$ elements:

This is where the notions of chains and antichains comes in handy. In short, a chain of $\mathbb P$ is a collection of elements of $\mathbb P$ such that are linearly ordered amongst themselves, and an antichain of $\mathbb P$ is a collection of elements that are pairwise incomparable. There's an analogy to be made here with the class of data structures that we described earlier: you might think of a chain of $\mathbb P$ as an "ordered list" substructure embedded in our data structure, and an antichain of $\mathbb P$ as an "unordered bag" substructure embedded in it.

Let's use $A(\mathbb P)$ to denote the cardinality of the largest possible antichain of $\mathbb P$, that is, the antichain with the most possible elements. One can show that $\mathbb P$ therefore has at least $2^{A(\mathbb P)}$ different cuts. (I'll leave this as an exercise - it is not too tricky to prove if you use the Szpilrajn extension theorem.) This means that knowing the size of the maximal antichain of $\mathbb P$ gives us a lower bound on "how difficult" looking up a value in $\mathbb P$ can be in the worst case.

Another interesting and useful fact is Dilworth's Theorem, which states that there is a way of partitioning $\mathbb P$ into exactly $A(\mathbb P)$ chains. We can use this to derive an upper bound on the number of distinct linear extensions of $\mathbb P$. Using a stars-and-bars argument, we may claim that if the respective sizes of these chains are given by $c_1,\cdots, c_{A(\mathbb P)}$, then the number of linear extensions of $\mathbb P$ is at most We know that the sum of the chain sizes $c_i$ is equal to $n$, since the chains partition the $n$ elements of $\mathbb P$. Using log-convexity of the factorial function (rather, the Gamma Function), we can show that replacing each value $c_i$ with $n/A(\mathbb P)$ in the above expression can only increase its value. This allows us to claim that there are at most distinct linear extensions of $\mathbb P$.

There is one more key observation to make. Making a single additional non-redundant comparison query on the data that has been given is tantamount to extending our poset $\mathbb P$ to a new poset $\mathbb P'$ that is "more strictly ordered" than $\mathbb P$, in such a way that $\mathbb P'$ is generated by the comparisons already present in $\mathbb P$ combined with one additional comparison. Suppose that the elements $x,y\in\mathbb P$ are compared. Depending on the result of the comparison, either the relation $(x,y)$ or the relation $(y,x)$ is added to the resulting poset. We might denote by $\mathbb P_{x<y}$ the poset generated by $\mathbb P$ and $(x,y)$, and similarly $\mathbb P_{y<x}$ the poset generated by $\mathbb P$ and $(y,x)$. Each linear extension of $\mathbb P$ is either a linear extension of $\mathbb P_{x < y}$ or $\mathbb P_{y < x}$. Thus, by a pigeonhole argument, one of $\mathbb P_{x < y}$ or $\mathbb P_{y < x}$ must have at least half as many linear extensions as $\mathbb P$. We can conclude that from any given state of knowledge, in the worst case, carrying out a non-redundant comparison reduces the number of linear extensions by a factor of two - no choice can be made that guarantees that it will be reduced by a greater factor.

We have just argued that a single comparison can reduce the number of linear extensions of $\mathbb P$ by a factor of at most $2$, in the worst case. But earlier, we showed that if $\mathbb P$ has a maximal antichain of size $A(\mathbb P)$, then it has at most distinct linear extensions. Of course, when no comparisons have been carried out on the $n$ given values, there are $n!$ possible ways in which they could be ordered, corresponding to the $n!$ linear extensions of a totally unordered set of $n$ elements. This means that any strategy guaranteeing a state of knowledge with a maximal antichain of size $A(\mathbb P)$ also guarantees that the number of linear extensions is reduced by a factor of at least But each individual comparison can guarantee at most a decrease in the number of linear extensions by a factor of $2$. Thus, this guarantee requires at least comparison queries in the worst case. Using an asymptotic formula for the log-gamma function, we have that the above expression is Finally, recall our earlier argument that $\mathbb P$ has at least $2^{A(\mathbb P)}$ cuts. By similar reasoning, since each comparison query can reduce the number of possible cuts at most by a factor of $2$ in the worst case, we have that at least $\log_2(2^{A(\mathbb P)}) = A(\mathbb P)$ comparison queries are needed in the worst case to perform a successful lookup.

This gives us a useful relationship between the complexity of inserting the first $n$ elements into a data structure, and the complexity of subsequently looking up an element in that data structure. This relationship takes the form of a pair of lower bounds depending on the common quantity $A(\mathbb P)$. In the worst case, at least comparisons must take place in order to reliably arrive at a state of knowledge with at most $A$ antichains, and from a state of knowledge with $A$ antichains, at least $A$ further comparison queries must be made in the worst case in order to perform a lookup operation. In other words, in order to guarantee a worst-case lookup complexity of $A$ comparisons, we must frontload at least comparisons in the worst case.

Whew! That was a lot of reasoning! Let me give a concise play-by-play summary of the above argument, to wrap up this section:

• A poset $\mathbb P$ is used to describe the order relationships that we know about a collection of data.
• The number of different linear extensions of a poset $\mathbb P$ quantifies the information that we lack about the ordering of our data.
• The number of cuts of a poset $\mathbb P$ quantifies the information that we lack about the location of an element in our data structure.
• Making a single comparison query can cut down the number of possible linear orderings at most by a factor of two.
• A poset $\mathbb P$ with an antichain of size $A$ has at least $2^A$ different cuts.
• Therefore, locating an element in a poset $\mathbb P$ with an antichain of size $A$ requires at least $A$ comparisons in the worst case.
• When no queries have been made, there are $n!$ different linear extensions of $\mathbb P$.
• A poset $\mathbb P$ with maximal antichain size $A$ has at most $n!/\Gamma(1 + n/A)^A$ linear extensions.
• Therefore, at least $\log_2 \Gamma(1 + n/A)^A$ comparisons must be made in the worst case to guarantee that $\mathbb P$ has maximal antichain size at most $A$.
• Hence, guaranteeing that at most $A$ comparisons are needed to locate an element requires us to frontload at least $\log_2 \Gamma(1 + n/A)^A$ comparisons in the worst case.

### A case of asymptotic speedup

Let's apply this result to the specific case of interest, namely the problem of designing an optimal-lookup data structure with $\mathcal O(\log\log n)$ insertion complexity. Suppose that we insert $n$ elements into our data structure, one by one, and then perform a lookup. The above bound tells us that if the lookup complexity is bounded by $\mathcal O(A(n))$, then the total complexity of inserting all of the $n$ elements must be at least in the worst case. On the other hand, if we're requiring that the insertion complexity be $\mathcal O(\log\log n)$ in the number of elements, the total complexity of inserting the first $n$ elements must be at most Therefore, it must hold true that With a bit of algebra, one can see that this implies Thus, there exists some constant $p > 0$ such that But remember that earlier, we already constructed a whole family of data structures showing that the above asymptotic lookup complexity is attainable for any value of $p$, even while restricting the insertion complexity to $\mathcal O(\log\log n)$. This means that by simply choosing an arbitrary value $q > p$ and letting $\ell(n)$ be defined as follows: we can obtain a data structure with $\mathcal O(\log\log n)$ insertion complexity and strictly asymptotically better lookup complexity. In particular, this data structure will have better lookup complexity by a factor of $\Omega((\log n)^{q-p}/\log\log n)$. This is our "speedup factor" for a particular choice of $q > p$.

### Final thoughts

Last month, I graduated from UNM and completed my honors thesis in mathematics. My thesis was entitled Algebra of growth orders (you can access a somewhat recent draft here) and it's been a years-long project that began with a LaTeX document full of personal notes attempting to formalize certain properties about asymptotic growth orders that were often hand-waved in my algorithms classes. I've also written a couple of blog posts (number one, number two) outlining some questions about growth orders that have inspired my curiosity. I haven't written much about growth orders recently, but it's only because I've burned myself out editing and tweaking the final draft of my thesis, which has become a bulky document! I owe a huge thanks to Professor Cristina Pereyra for reading through the whole thing and giving detailed feedback.

All of this research on growth orders has been very "pure" in the sense that I've been studying the properties of growth orders as purely mathematical objects without making many connections to the applications of growth orders. But the applications are plentiful, whether they be in real analysis, number theory, or the analysis of computational complexity of algorithms. One of the things I've come to appreciate on a profound level during my research is the pathological nature of the space of growth orders. In practical applications, we generally only encounter a few commonly-occurring kinds of growth orders that become close friends (polynomial, logarithmic, exponential) and once in a blue moon we stumble upon a more exotic growth order (see the General Number Field Sieve algorithm). By the textbook definition of a growth order, there are many more growth orders than these, and they can behave in very unintuitive ways, to such an extent that a lot of the reasoning I've seen used in algorithms classes to manipulate growth orders could be seen not only as hand-wavy, but as plain wrong.

The problem I've laid out in this blog post is personally significant to me for two main reasons. First of all, as I mentioned above, we don't often see the unintuitive properties of growth orders carry over into unintuitive properties of algorithms or computational problems, and this problem is the first example of such a thing actually happening that has been elementary enough for me to understand. I'd say that asymptotic speedup is one of the most shocking phenomena in complexity theory, particularly since computer scientists are so used to saying things like "find the fastest algorithm". Secondly, while it's commonplace to see upper bounds on the runtime of algorithms, it's less common to see lower bounds on how efficient a solution to a particular computational problem can be at best. These lower bounds are hard to derive and usually require reasoning about some sophisticated general model of computation. Here, I've cheated a little by casting the problem in an information-theoretic light, but I still find it compelling.

This post also answers an old unanswered question of mine on CS Theory StackExchange, so I've added my own answer to the question posted by my former self. I gotta say, it's really satisfying when that happens.