What It’s Like To Solve a Math Olympiad Problem

And What This Tells Us About Creativity, Reasoning, And AI

Sep 17, 2024

It’s all fun and games watching AI gradually automate work that other people do. But when Google announced that their “AlphaProof” system can solve International Mathematical Olympiad problems, things got personal. Solving IMO problems is what I do. Or did, anyway; I was on the US team back in 1983 and 1984.

USA IMO Team, 1983 — The 1983 US IMO team in Paris. That’s me lookin’ good in the upper right.

When I read the AlphaProof announcement, I realized it represented an opportunity for me to probe the gap between human and AI problem-solving abilities. A lot of energy goes into evaluating AIs, but honestly most of us don’t understand human abilities all that well. It’s surprisingly difficult to reflect in detail on the capabilities that we rely on for creative problem solving. So I spent some time plowing through a few of this year’s IMO problems, taking notes of my thought process along the way. The results were illuminating.

The IMO is an elite competition, but the cognitive skills involved are nothing mysterious. I’m going to walk you through my journey on one particular problem that only uses elementary-school math, highlighting the mental tools I used. I’ll end with some brief thoughts on how this relates to other forms of intellectual work, and what that might mean for the future of AI.

Let’s begin!

The First Step Is To Understand the Problem

Here is question 3 from the 2024 International Mathematical Olympiad. As written, it’s pretty dense, so feel free to skip it – I’ll explain everything below. (In fact, feel free to skim past all of the math-y bits in this post; the important thing is the mental journey that connects them.)

This might seem baffling at first glance. It was baffling to me!

I had to read the problem statement a couple of times just to grasp the basic structure, and then I could go through slowly and work out what each phrase meant.

Let a₁, a₂, a₃, … be an infinite sequence of positive integers,

This is just a fancy way of saying “let’s write down some numbers”.

and let N be a positive integer. Suppose that, for each n > N, aₙ is equal to the number of times aₙ₋₁ appears in the list a₁, a₂, …, aₙ₋₁.

This says that the first N numbers in the list can be anything at all; I’m going to call that the preamble. And each number after that is generated by something I’ll call the counting rule. The counting rule says: look at the last number so far, see how many times it appears in the sequence, and write down that number. If we have “1 1 2 1”, then the next number will be three, because the last number was 1 and we’ve had three 1s.

Prove that at least one of the sequence a₁, a₃, a₅, … and a₂, a₄, a₆, … is eventually periodic. (An infinite sequence a₁, a₂, a₃, … is eventually periodic if there exist positive integers p and M such that bₘ₊ₚ = bₘ for all m ≥ M.)

Having to work out what that business with the p’s and M’s was getting at would have been a distracting chore. Fortunately, I was already familiar with the phrase “eventually periodic”, which just means “after a while, it falls into a repeating loop”. So, we have to prove that when we look at every other position in the sequence1, we’ll eventually find a loop – something like “3 7 3 7 3 7 3 7” repeating forever.

After a few careful read-throughs, I had a surface understanding of the problem. But I hadn’t the faintest idea how it could be solved. We’re asked to prove that the sequence eventually repeats, but all we’re given to work with is the fact that it follows the counting rule. There’s no obvious connection between counting and repetition. How to begin?

Tools used: Break down the problem statement into manageable pieces. Read-read each piece until you understand it. Have a mental library of important concepts in the problem domain (like “eventually periodic”). Assign shorthand names to important ideas (“preamble”, “counting rule”).

To Understand How Something Works, Observe It In Action

When studying for competitive math exams, one of the first things you’re taught to do with a new problem is to work out some examples. Once you’ve seen the rules of the problem play out a few times, you’ll have a better understanding of what they mean.

Consider the board game Risk. Reading the rules, you’ll learn that you get extra armies for controlling an entire continent, and you get an especially large number of armies if you control a large continent like Asia. But you have to play a few games to grasp the implications, such as:

Everyone wants extra armies, so they’ll be fighting you to control the nicest continents.
No one wants you to get extra armies, so if you conquer an entire continent, someone will make a point of spoiling it for you by recapturing one small part.
If you make an early show of force in a small continent, people might back off and let you keep it.
Never get involved in a land war in Asia.

By the same logic, I knew I would have to “play a few games” with the counting rule. I began with the simplest possible preamble – a single 1. Then I applied the counting rule. The last (and only) number was a 1, and there was one of it, so I wrote 1 again:

1 1

Now the last number was a 1, and there were two of it, so I wrote 2:

1 1 2

Now the last number was a 2, and there was one of those, so I wrote 1:

1 1 2 1

And so on:

1 1 2 1 3 1 4 1 5 1 6 1 …

It was immediately obvious that, when we look at every other number, they’re all 1s – exactly the sort of repeating sequence the problem asks for. This prompted me to try another standard trick: when you observe a pattern, see whether you can prove that pattern will always hold.

Tools used: when presented with a range of scenarios, work through a specific example, and look for a pattern.

Could It Be This Simple?

[This is a long post, because I wanted to present the entire proof, along with the thought process that got me there. If you’d prefer to get to the punchline, just read the first and last paragraph of each section, plus the entire last section.]

I could see a compelling logic to the sequence. We keep getting new numbers that are bigger than any number before. After each of those numbers, we’ll write a 1, because by definition the biggest-number-yet has only appeared once so far. And after each of those 1s, we write a new even-bigger number, because now we’ve had more 1s than ever before.

For instance, when the three appears, it’s the first 3, so we write a 1. And that’s the fourth 1, so we write a four. That’s the first 4, so we write the fifth 1, then the first 5, then the sixth 1, then the first 6, and so forth.

I wondered, maybe all sequences fall into this pattern? The idea was tantalizingly logical:

At some point, we’re going to get a number that’s bigger than any number before. Suppose it’s 37.
Because we’ve never had a 37 before, the next number will be 1.
But how did we get a 37 in the first place? Say that just before the 37, we had x. That is, suppose the sequence up to this point was …, x, 37, 1. Because we placed a 37 after the x, it must be that x had appeared exactly 37 times.
What could x be? Whatever it is, it is the number that is showing up most often. (If some other number y were appearing more often than x, then we would have 37 y’s before 37 x’s, and so this wouldn’t be the first time 37 has appeared.) For reasons I’ll explain below, we might expect that the number showing up most often is 1. And if x is 1, then now we have 38 of them (we just added another in step 2), so we’ll write a 38.
Now 38 is the biggest number yet, so we’ll add another 1, and the sequence will repeat: … 37 1 38 1 39 1 …

The only supposition here is in step 4, where I assumed that 1 is the number that has shown up most frequently. This struck me as a reasonable thing to suppose. Imagine, instead, that 3 is the most frequent number. That means we’ve written a bunch of 3s, which we would only have done if a bunch of numbers have shown up three times. Before those numbers showed up for a third time, they must have shown up for a first and second time, so we’d also have written at least as many 1s and 2s as 3s. It seems hard for 3 – or any number other than 1 – to consistently show up more often than 1.

Unfortunately, IMO problems require mathematical proof, and “it seems hard” is not a proof. I tried to come up with a formal proof that, after some initial settling period, 1 would always show up more often than any other number. But I couldn’t come up with any good ideas; whatever half-baked ideas did occur were quickly revealed to have flaws.

Another trick I’ve learned over the years: if something seems true, but you can’t prove it, maybe it isn’t really true. I decided to try more examples, to test whether 1 always shows up most often.

Tools used: when you find a pattern, look for a principle which would explain the pattern. To test your logic, try to prove that the principle is correct.

When In Doubt, Try More Examples

I wanted to come up with a preamble that would turn out differently than my initial example, but was still fairly simple. I went with “1 2 2”. Here’s how that turned out – note that I’m adding a # to separate the preamble from the numbers generated using the counting rule:

1 2 2 # 2 3 1 2 4 1 3 2 5 1 4 2 6 1 5 2 7 1 6 2 …

This time, the repeating pattern is 1 2 1 2 1 2. Twos are showing up just as often as ones, thus disproving my idea that 1 must always appear most often.

On inspection, could see that this is because the preamble gives 2 a head start: it shows up twice, while 1 only appears once. Then things settle into a pattern where, whenever a new biggest number shows up, it also shows up twice. Because each new number shows up twice, we perpetually get as many twos as ones.

At this point, I wasn’t sure where to go, so I sketched some more examples. I tried to steer toward as many different situations as possible – starting with longer sequences, more 2s than 1s, or only using 3s:

1 2 # 1 2 2 3 1 3 2 4 1 4 2 5 1 5 2 …
1 2 1 2 # 2 3 1 3 2 4 1 4 2 5 1 5 …
1 2 1 2 1 2 # 3 1 4 1 5 1 …
1 2 2 2 2 # 4 1 2 5 1 3 1 4 2 6 1 5 2 7 1 6 2 8 1 …
3 3 # 2 1 1 2 2 3 3 4 1 3 5 1 4 2 4 2 5 2 6 1 5 3 6 2 7 1 6 3 7 2 8 1 7 3 …

A pattern always emerges, but in the last example it’s more complicated (the repeating sequence is 2 1 3), and in that example the pattern doesn’t emerge until the counting rule has been applied 18 times. At this point I was baffled – what mechanism could allow the sequence to unfold in so many different ways, and yet ensure that it always repeats?

It was time to try another approach.

Tools used: if you get stuck, try more examples. Try to cover a range of scenarios.

Working Without a Plan

I had no idea how to prove that the sequence will always repeat. I decided to see what I could prove.

I noticed that, in the sequences I’d written out, sometimes there’s an messy period before things settle down into a repeating pattern. The messy period clearly had something to do with the numbers in the preamble. Also clearly, as we go deeper into the sequence we get bigger numbers, so eventually we’ll be getting mostly numbers that are bigger than anything in the preamble. Maybe that’s a necessary step toward repetition – we need to climb past the numbers in the preamble?

As a mental shorthand, I decided to use m to refer to the biggest number in the preamble. And then I realized that I could indeed prove something about numbers bigger than m: each such number can only show up a limited number of times.

To prove this, I used a standard trick called proof by contradiction. You assume the thing you want to prove isn’t true, and then show that leads to an impossibility. Here, I used a common variation where you specifically look for the first time your hypothesis isn’t true.

Proposition 1: for any number x > m, x appears at most m times.
Proof: suppose the proposition is not true, i.e. that some numbers larger than m show up more than m times. Find the first time that happens – the first place in the sequence where some number x > m shows up for the m+1’st time.

Because x > m, and m is the largest number in the preamble, x can’t appear in the preamble. All m+1 times that it has shown up must have been generated by the counting rule. This means that there are m+1 numbers that have shown up x times. What are those numbers that have shown up x times?
We know that x is the first number greater than m to show up at least m+1 times, so all of the numbers that have shown up x times (remember x > m) are less than or equal to m. But there are only m positive integers less than or equal to m. We’ve arrived at our impossibility: it’s impossible for m+1 numbers to have shown up x times.

Apologies if this is hard to follow; again, feel free to skip the propositions and proofs. Each step is logically straightforward, but there are a lot of pieces to keep track of, and they’re quite abstract. This points at one of the skills necessary for solving this type of problem: you need to have practice at manipulating abstract concepts. Later on I started to struggle with this, but for the moment I was on a roll.

Tools used: if you can’t come up with a plan, try striking out at random. Look for partial patterns, and try to explain them. Have a collection of tools from the problem domain (such as proof by contradiction), and know when to use them.

Building Momentum

I kept turning the problem over in my mind, looking for further statements I could prove – and finding them. Here’s a slightly cleaned up version of how that went. I’d already proved that big numbers (numbers bigger than m) show up a limited number of times. Next, I found a way to prove something else I’d observed: there are always small numbers that keep showing up.

Proposition 2: the number 1 appears an infinite number of times.
Proof: whenever a number appears for the first time, we add a 1 to the sequence. So if 1 appears only a finite number of times, that means the number of distinct numbers is finite. Suppose the largest number that ever appears is M.

As we build our sequence, we’re always adding numbers between 1 and M. Eventually, one of those numbers must show up for the M+1st time. And then we’d add M+1 to the sequence, which contradicts the idea that M was the largest number to appear.

Proposition 3: all of the numbers that appear an infinite number of times are adjacent. For instance, we might have 1, 2, and 3 all appearing an infinite number of times, but we’d never get 1, 2, and 4 (without 3).
Proof: for x to appear an infinite number of times, there must be an infinite number of numbers which appear x times. Each of those numbers must first appear x-1 times, so if x appears an infinite number of times, so must x-1. Thus, there can’t be any gaps in the set of numbers that appear an infinite number of times.

At this point, I was starting to get a headache. Look at the first sentence in the proof of proposition 3. “x” appears in two different roles. First, it’s mentioned as a thing that shows up in the sequence. Then, it’s mentioned as a count of other numbers in the sequence. The fact that each number in the sequence has a dual role, first as a count, then as a thing, kept tangling me up. It was like my brain had to keep yanking each number out of one mental slot and drop it into another, breaking all the connections.

To drive this home, consider one of the example sequences I worked out above: 1 2 2 2 3 1 2 4 1 3 2 5. Here’s how we might describe that sequence:

The preamble is 1 2 2. We’ve had two 2s, so then we add a 2. Then we’ve had three 2s, so we add a 3. Then we’ve had one 3, so we add a 1. Then we’ve had two 1s, so we add a 2. …

Try reading that out loud, see how confusing it is? I’d had a lot of practice with abstraction, but this was getting to be too much. Still, I managed to prove a couple more propositions that really narrowed down the way the sequence could play out:

Proposition 4: Let k be the largest number that appears an infinite number of times. Then we can find some number K such that all numbers greater than K appear at most k times. (To support later steps, we also choose K to be larger than m and larger than the size of the preamble.)
Proof: k+1 appears a finite number of times, so there are a finite number of numbers which appear at least k+1 times. Let K be the largest of those numbers. By definition, all numbers greater than K appear at most k times.

Proposition 5: things eventually settle into an alternation of large and small numbers. Once all numbers from 1 to k have appeared at least K times, the sequence alternates between numbers no larger than k and numbers larger than K.
Proof: assume all numbers from 1 to k have appeared at least K times. (This must happen eventually, since by proposition 3, those numbers eventually appear an infinite number of times.) The next time one of these numbers appears, it will be appearing for at least the K+1st time, so it will be followed by a number larger at least equal to K+1. That number can only have appeared up to k times (proposition 4), so the number after that will be less than or equal to k. Each number ≤ k is followed by a number > K, and each number > K is followed by a number ≤ k.

Proposition 5 was heartening: I’d proved that the sequence eventually settles into an alternation! Small numbers (less than or equal to k) alternate with big numbers (greater than K). The problem calls for us to prove that something is true about every other number in the sequence, so it felt promising to have proved that some sort of alternation arises. Now I just needed to prove that, within that alternation between large and small numbers, the small numbers would eventually repeat.

Tools used: continue on a path so long as you’re making useful progress. Look for steps that take you closer to the end goal.

Searching For Proof In All The Wrong Places

I had previously noticed that the sequence often starts out looking messy, and then gradually settles into a repeating pattern. Sometimes the messy period lasts for a while. Recall that, in one of the examples I worked out earlier, the simple prologue “3 3” requires 18 steps before settling down:

3 3 # 2 1 1 2 2 3 3 4 1 3 5 1 4 2 4 2 5 2 6 1 5 3 6 2 7 1 6 3 7 2 8 1 7 3 …

I felt that if I could understand the process by which this occurs, I could prove that all sequences settle down. I sketched a few more examples, trying to push the prologue in different directions and understand how this impacted the settling process:

1 2 2 2 2 2 2 # 6 1 2 7 1 3 1 4 1 5 1 6 2 8 1 7 2 9 1 8 2
1 5 5 5 5 5 5 # 6 1 2 1 3 1 4 1 5 7 1 6 2 2 3 2 4 2 5 8 1 7 2 6 3 3 4 3 5 9 1 8 2 7 3 6 4 4 5 10 1 9 2 8 3 7 4 6 5
1 3 # 1 2 1 3 2 2 3 3 4 1 4 2 4 3 5 1 5 2 5 3 6 1

I couldn't discern any concrete pattern to how things settle down, but I did have a fuzzy sense. Eventually we'll reach a point where some set of small numbers have all appeared roughly the same number of times, and then those numbers repeat forever. But at first, some of those numbers haven't appeared "enough" times, and they need to "catch up" before the repetition can begin. For instance, again in the second-to-last example above, we need a lot of 1s, 2s, 3s, and 4s to catch up with all the 5s.

But despite what struck me as the appealing logic of this approach, I didn’t see anything I could prove about it. So I pulled another tool from the toolbox: look at the situation from a different direction.

Tools used: if you get stuck, look for another approach.

The Big Numbers Are Just A Distraction

It dawned on me that all of the interesting stuff that was going on had to do with those small numbers. Here's that example sequence I keep referring to:

1 5 5 5 5 5 5 # 6 1 2 1 3 1 4 1 5 7 1 6 2 2 3 2 4 2 5 8 1 7 2 6 3 3 4 3 5 9 1 8 2 7 3 6 4 4 5 10 1 9 2 8 3 7 4 6 5

Big numbers like 8, 9, and 10 all play exactly the same role. Each of them is going to show up exactly 5 times (Proposition 4, which I'd already proved at this point), always in alternation with the small numbers (Proposition 5). Basically, these big numbers are just tokens that keep track of how often the small numbers have appeared. All of the interesting structure in the sequence, it seemed to me, is determined by the small numbers.

I decided to try writing the sequence in a form that would emphasize any patterns in the small numbers. Because this sequence has 5 small numbers that keep repeating (k=5), I made a 5-row table. The first row showed the number of 1s at each point in the sequence, the second row showed the number of 2s, and so forth. Each column represents a position in the original sequence, and I circled the entry corresponding to the small number which had shown up most recently. Here's what that looked like:

Does this reveal anything to you, other than the fact that I have bad handwriting? Me either. But despite the lack of progress, I was still stuck on the idea that I had to show how the sequence would “settle down”. Rummaging through the list of proof techniques I’d been taught, I found one that might help.

Tools used: try transforming the problem into an alternative representation that might reveal new patterns.

Looking For a Measure of Disorder

On further review, I did notice one thing about the diagram I’d drawn. If you look at the pattern of circled numbers, at first they are jumping all over the place, but gradually they settle down into tidy diagonal lines. If I could prove that this always happened, I could probably turn that into a formal proof of the original question.

I wanted to find a way to measure how “unsettled” the sequence was. If I could find a formal definition of unsettledness, then perhaps I could prove that unsettledness must always go down, forcing the sequence to settle into a repeating pattern.

This idea comes up in many mathematical proofs. For instance, suppose someone gives you a list of numbers and asks you to sort them in numerical order. Here's one way of doing that:

Look for a pair of adjacent numbers that are out of order.
If you can't find one, then you're done! The numbers are all in order.
Otherwise, swap those numbers and go back to step 1.

For instance, suppose we start with the numbers 4 1 3 2. We might proceed as follows:

4 1 3 2 → swap 4 and 1

1 4 3 2 → swap 3 and 2

1 4 2 3 → swap 4 and 2

1 2 4 3 → swap 4 and 3

1 2 3 4 → nothing more to swap!

Could there be a situation where we keep swapping numbers forever without ever getting them all in order? It turns out that we can prove that never happens, by coming up with a measurement for how out-of-order the list is. We say that the “unsortedness” of a list is the number of pairs of numbers which are in the wrong order. In the initial list above, the following pairs are out of order: 4-1, 4-3, 4-2, and 3-2, so the unsortedness is 4.

Whenever we swap a pair of adjacent numbers that were out of order, we reduce the unsortedness by exactly 1. The numbers we swapped used to be out of order, now they are in order. And every other pair of numbers remains in the order it was in previously. Thus, the unsortedness decreases by 1. If the unsortedness of the initial list is 4, then it will take exactly 4 swaps to put it in order – as we see above.

I thought that I could use a similar approach to prove that the number sequences for this IMO problem always settle into a repeating pattern. I just needed to find an appropriate definition of “unsettledness”. But no matter how long I started at my yellow pad, I kept drawing a blank. It was time to look for another way forward.

Tools used: when faced with a problem, try applying a known technique for that type of problem. But recognize when it isn’t working.

If One Tool Doesn’t Work, Try Another

Around this time, I remembered another classic trick for proving that a sequence eventually repeats. If you have a sequence where each entry depends only on the previous entry, and there are a limited number of possible entries, then eventually the sequence must hit on an entry it has used before. Once that happens, it will repeat forever.

For instance, suppose that we generate a sequence of numbers using the following rule:

Start with a one-digit number.
Add 3, and keep the last digit.
Repeat step 2 forever.

If we start with 3, we get the following sequence:

3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 ...

The eleventh number is a 3, the same as the first number. After that the sequence starts over, always repeating the same ten values (3 6 9 2 5 8 1 4 7 0) in the same order. If we had started with a different number, we’d wind up in the same loop. For instance:

2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 2 5 8 1 4 7 0 3 6 9 ...

There are only ten possible numbers, so eventually the sequence has to repeat itself. And then it will keep repeating in exactly the same way, because it’s always following the same rule.

This trick won’t work for our Olympiad problem, because each new number is a function of the entire past history of the sequence, and that history is always growing. But still, we need to prove that a sequence eventually repeats, and this is a technique for doing just that. I asked myself, is there some way of stretching it to fit the problem?

Tools used: when one technique fails, try another. If it doesn’t immediately work out, exercise judgement as to whether it’s worth exploring further.

Finding the Right Representation

In the IMO problem as stated, each new number depends on the entire past history of the sequence. But in the grid representation that I developed earlier, each column in the grid depends only on the previous column2. If I could just prove that there are a finite number of arrangements of numbers that can show up in a column, I’d be done.

Unfortunately, there are an infinite number of arrangements of numbers, because the numbers keep getting bigger. I wondered, could I find some way of distilling the “essence” of each column, such that there would only be a finite number of possible “essences”? (This idea, of transforming something into a new form that better suits your purposes, is another standard tool in the mathematical toolbox.)

I wondered whether the ranking order might repeat. By ranking order, I mean ranking the numbers 1 through k (in our example, 1 through 5) according to how many times each one has shown up so far. There are only so many ways of ranking k numbers, so eventually the same ranking would have to show up for a second time. Unfortunately, that doesn't lead to a repeating sequence, because the ranking in one column is not determined solely by the ranking in the previous column. Hmph.

Then I had another idea: the numbers in that grid diagram seem to be increasing more or less in lockstep. could I prove that the numbers in a given column always remain fairly close together? For instance, if k is 5, that would mean proving that in any given column, the difference between the number of 1s, the number of 2s, the number of 3s, the number of 4s, and the number of 5s could never exceed some limit. (In the example, that limit turns out to be 6.) If the differences are capped, then there are only so many different arrangements they can take on – there is a finite number of arrangements of differences between the numbers in a column. And if you know the differences between the numbers in one column, that is all you need to figure out the differences between the numbers in the next column. Therefore, the differences would eventually have to fall into a repeating loop! Was that daylight I saw at the end of that tunnel?!?

Given the facts I'd already proved (propositions 1 through 5), it turned out to be not too difficult to prove that there is, indeed, a limit on the difference between how many times any two small numbers appear in the sequence. That means the pattern of differences must eventually repeat, which means the pattern of small numbers must eventually repeat, which means I had solved the IMO problem!

(Fine, I probably spent at least 2 or 3 times as much time as IMO competitors are allowed. Give me a break, I’m rusty.)

Proposition 6: across all positions in the sequence, there is a finite upper bound on the difference between the number of times each of the infinitely-repeating numbers (numbers in the range 1…k) have appeared.
Proof: by contradiction. Let us say that for two numbers i and j, both in 1…k, i “lags” j if j appears increasingly more often than i. (More precisely, i lags j if for any number you choose, I can point to a location in the sequence where j has appeared that many times more than i.)
Suppose i is the smallest number such that lags some other number, and let j be one of the numbers that it lags. Each appearance of j in the sequence is preceded by a number that has appeared j times. Thus, j must be less than i; otherwise those numbers would have previously appeared i times, meaning that i would be appearing as often as j.
Because i is the smallest laggard, none of the numbers smaller than i can lag one another; there is a finite bound on how far apart their appearance counts can diverge. Call that limit b, and find a point in the sequence where j has appeared at least b+1 times more than i. At that point in the sequence, it must be the case that all numbers less than i have appeared more times than i has. It must also therefore be the case that all numbers less than i have appeared more times than any numbers in the range i+1 … k, because those numbers can’t have appeared more times than i has (using the same logic as in the previous paragraph).
Recall that according to proposition 5, after some early period, the sequence alternates between small numbers (no larger than k) and large numbers. We’ve shown that the numbers 1…i-1 have all appeared more often than the numbers i…k. This means that whenever a number in the range 1…i-1 appears, it will be followed by a number so large that it has shown up at most i-1 times. That number, in turn, will be followed by a number no greater than i-1. So the numbers i…k will never appear again, contradicting the idea that i appears infinitely often.
This proof can be summarized using the following intuition: the numbers that appear infinitely often must show up more or less in “lockstep”. If a large number “falls behind” the smaller numbers, then it will never appear again; and a small number can’t “fall behind” a larger number, because of the way the counting rule works.

Tools used: if there’s a standard tool that doesn’t quite fit your situation, look for a way to reframe the situation to make it fit.

All Creative Problem Solving Looks Like This

Why did I just drag you through over 5000 words on a niche activity? Because I think the mental toolbox it demonstrates is one that we all use every day.

The final proof follows a tidy linear structure, but the process that led me there was anything but. I started out by simply re-reading the problem statement until I understood it, and then working out examples until I started to grasp the implications. I worked forward (what can I prove?) and backward (how can I set up the conditions for a standard tool for proving that a sequence repeats)? I explored multiple paths, sometimes fighting through difficulties, sometimes giving up and trying a different tack. I assigned shorthand names to concepts like “preamble” or “laggard”, and I looked for alternative framings that might make the problem more tractable.

Those are all strategies. I also relied on a rich library of specific mathematical tools and concepts: proof by contradiction, “eventually periodic”, a sequence must repeat if there are a finite number of possible entries and each one depends only on the previous entry, and so on.

When you architect a software module, organize a trip, make a lesson plan, or plot a short story, you use similar tools. You need to understand the problem, explore options, work forwards and backwards, come at the situation from another angle. And you’ll certainly rely on a library of ideas and techniques, along with muscle memory of how to apply them.

AIs are infamous for being hit-and-miss in their capabilities. In some respects they are already superhuman, in others they are bafflingly inept. Understanding the richness of the mental toolkit we all use every day – it’s so much more than a simple “chain of thought” – will help us gauge progress toward true AGI, and evaluate the potential of AlphaProof and OpenAI’s new “o1” model. I’ll say more about this in an upcoming post.

More precisely, either the even positions (a₂, a₄, a₆, …) or the odd positions (a₁, a₃, a₅, …); we can pick either one.

More precisely, this is true once we get to the point where every number from 1 through k has appeared at least m' times. In the sequence represented in the grid that I drew, this means that 1 through 5 have each appeared at least 7 times. From that point forward, each column in the grid is identical to the previous column except that we add 1 to one row. That row is determined by the rank of the number we incremented in the previous column. For instance, if in the previous column the number we incremented is the fourth-highest number, then in the next column we increment the number in the fourth row.

Greg Burnham

This is a terrific post!! One question for you. This is the third problem on the first day of the 2024 IMO (P3). AlphaProof didn't solve this one, but did solve the third problem on the second day (P6). These are both very hard problems for humans: each was only solved by only about 2% of contestants. So, it's just one data point, but what do you think might account for the difference in AlphaProof's performance?

My hunch, FWIW, is that the re-representation of the problem in the grid format makes it trickier for AlphaProof. I've looked up a few solutions to P3, and they all make use of a similar technique at some point. P6 requires a similarly long and intricate build-up of observations and intermediate results, but I would say it doesn't really require seeing the problem in a new way. In other words, both problems are deep, but P3 requires more creativity. Then again, it's also possible that the reason is more technical: maybe combinatorics is just clunkier to represent in Lean, making the solution necessarily longer, and thus yielding a bigger search space; I don't know anything about that.

Curious for your take, though!

Expand full comment

1 reply by Steve Newman

tasdourian

Sep 18

Great Post! I was actually hoping you'd write a post like this, detailing what strategies high-level problem solvers use and how they use them. (I know you said all of us do, and that's true to a degree, but c'mon now...)

I also am very eager to see your analysis of the strengths and limitations of o1 mini and o1 preview-- I've found o1 mini to be much more capable and sophisticated at math/physics problems, and o1 preview has been able to solve the last 3 NYTimes Connections puzzles I've given it, whereas the previous 4o model couldn't solve these at all. Especially because Connections was the kind of puzzle where it seemed to require too much creativity/originality to just solve it using previous training data.

Anyway, I'd be interested in a hype-free analysis of the current new models, if that seems useful to you.

4 more comments...

Am I Stronger Yet?

Discussion about this post