BusyBeaver(6) Is Quite Large

7 months ago (scottaaronson.blog)

People on the bbchallenge Discord server are keen to speculate on how many Turing Machine states are needed to surpass Graham's Number, which is vastly larger than the 2^^2^^2^^9 achieved by the latest BB(6) champion.

We know from the functional busy beaver [1] that Graham behaviour can come surprisingly early; a 49-bit lambda term suffices. There are only 77519927606 closed lambda terms of at most that size [2], compared to 4^12*23836540=399910780272640 unique 6-state Turing Machines [3].

With the achievement of pentation in only 6 states, several people now believe that 7 states should suffice to surpass Graham's. I would still find that rather surprising. A few days ago, I made a large bet with one of them on whether we would see proof of BB(7)>Graham's within the next 10 years.

What do people here think?

[1] https://oeis.org/A333479

[2] https://oeis.org/A114852

[3] https://oeis.org/A107668

  • I can't pretend to be an expert, but I'll argue BB(7) is probably larger than Graham's number.

    BB has to grow faster than any computable sequence. What exactly that means concretely for BB(7) is... nothing other than handwaving... but it sort of means it needs to walk up the "operator strength" ladder very quickly... it eventually needs to grow faster than any computable operator we define (including, for example, up-arrow^n, and up-arrow^f(n) for any computable f).

    My gut feeling is that the growth between 47 million and 2^^2^^2^^9 is qualitatively larger than the growth between 2^^2^^2^^9 and graham's number in terms of how strong the operator we need is (with gramah's number being g_64 and g here being roughly one step "above" up_arrow^n). So probably we should have BB(7)>Graham's number.

    • Apologies if this feels adversarial, but I think your informal proof has an error, and I think I can explain it!

      Your proof rests primarily on this assertion:

      > BB has to grow faster than any computable sequence.

      This is almost true! BB(n) has to grow faster than any computable sequence _defined by an n-state Turing machine_. That last part is really important. (Note that my restatement is probably incorrect too, it is just correct enough to point out the important flaw I saw in your statement). This means that up-arrow^f(n) _can_ be larger than BB(n) — up-arrow^f(n) is not restricted by a Turing machine at all. As an easy example, consider f(n) = BB(n)^2.

      You may still be right about BB(7) being bigger than Graham’s number, even if your proof is not bulletproof

      9 replies →

    • I can’t edit my original comment anymore, but replying to say I went on a Wikipedia binge and I think you’re right. Thanks for humoring me and helping me learn!

      1 reply →

It boggles my mind that a number (an uncomputable number, granted) like BB(748) can be "independent of ZFC". It feels like a category error or something.

  • What makes BB(748) independent of ZFC is not its value, but the fact that one of the 748-state machines (call it TM_ZFC_INC) looks for an inconsistency (proof of FALSE) in ZFC and only halts upon finding one.

    Thus, any proof that BB(748) = N must either show that TM_ZF_INC halts within N steps or never halts. By Gödel's famous results, neither of those cases is possible if ZFC is assumed to be consistent.

    • I think what's most unintuitive is that most (all?) "paradoxes" or "unknowables" in mathematics involve infinities. When limiting ourselves to finite whole numbers, paradoxes necessarily disappear.

      BB(748) is by definition a finite number, and it has some value - we just don't know what it is. If an oracle told us the number, and we ran TM_ZFC_INC that many steps we would know for sure whether ZFC was consistent or not based on whether it terminated.

      The execution of the turing machine can be encoded in ZFC, so it really is the value of BB(748) that is the magic ingredient. Somehow even knowledge of the value of this finite number is a more potent axiomatic system than any we've developed.

      13 replies →

    • > Thus, any proof that BB(748) = N must either show that TM_ZF_INC halts within N steps or never halts. By Gödel's famous results, neither of those cases is possible if ZFC is assumed to be consistent.

      Isn't it more accurate to say that any proof that BB(748) = N in ZFC must either show that TM_ZF_INC halts within N steps, or never halts?

      Meaning, it's totally possible to prove that BB(748) = N, it just can't be done within the axioms of ZFC?

    • Does the fact that BB(k)=N is provable up to some k < 748 mean that all halting problems for machines with k states are answered by a proof in ZFC?

      2 replies →

    • I don't understand, surely if we assume ZFC is consistent then it's obvious that it won't halt? Even if its consistency can't be proven, neither can its inconsistency, so it won't halt. Or is that only provable outside of ZFC?

      I guess it's also hard when we have an arbitrary Turing machine and have to prove that what it's doing isn't equilavent to trying to prove an undecibable statement.

      2 replies →

  • > It boggles my mind that a number (an uncomputable number, granted) like BB(748) can be "independent of ZFC".

    It's BB(n) that is incomputable (that is there's no algorithm that outputs the value of BB(n) for arbitrary n).

    BB(748) is computable. It's, by definition, a number of ones written by some Turing machine with 748 states. That is this machine computes BB(748).

    > It feels like a category error or something.

    The number itself is just a literally unimaginably large number. Independence of ZFC comes in when we try to prove that this number is the number we seek. And to do that you need theory more powerful than ZFC to capture properties of a Turing machine with 748 states.

  • It boggles my mind that we ever thought a small amount of text that fits comfortably on a napkin (the axioms of ZFC) would ever be “good enough” to capture the arithmetic truths or approximate those aspects of physical reality that are primarily relevant to the endeavors of humanity. That the behavior of a six state Turing machine might be unpredictable via a few lines of text does not surprise me in the slightest.

    As soon as Gödel published his first incompleteness theorem, I would have thought the entire field of mathematics would have gone full throttle on trying to find more axioms. Instead, over the almost century since then, Gödel’s work has been treated more as an odd fact largely confined to niche foundational studies rather than any sort of mainstream program (I’m aware of Feferman, Friedman, etc., but my point is there is significantly less research in this area compared to most other topics in mathematics).

    • This ignores the fact that it is not so easy to find natural interesting statements that are independent of ZFC.

      Statements that are independent of ZFC are a dime a dozen when doing foundations of mathematics, but they're not so common in many other areas of math. Harvey Friedman has done interesting work on finding "natural" statements that are independent of ZFC, but there's dispute about how natural they are. https://mathoverflow.net/questions/1924/what-are-some-reason...

      In fact, it turns out that a huge amount of mathematics does not even require set theory, it is just a habit for mathematicians to work in set theory. https://en.wikipedia.org/wiki/Reverse_mathematics.

      2 replies →

    • > As soon as Gödel published his first incompleteness theorem, I would have thought the entire field of mathematics would have gone full throttle on trying to find more axioms.

      But why? Gödel's theorem does not depend on number of axioms but on them being recursively enumerable.

      4 replies →

    • Within ZFC one can prove that any two models of second order PA are isomorphic. ZFC proves that PA is consistent. ZFC is good enough to capture arithmetical truth.

      4 replies →

  • The number itself is not independent of ZFC. (Every integer can be expressed in ZFC.) What's independent of ZFC is the process of computing BB(748).

    • I think the more correct statement is that there are different models of ZFC in which BB(748) are different numbers. People find that weird because they don't think about non-standard models, as arguably they shouldn't.

      43 replies →

    • Sure, if someone just gives you the number, ZFC can represent it. But ZFC cannot prove that the value is correct, so how do you know you have the right number? Use a stronger proof system? Go a bit bigger and same issue.

      28 replies →

    • To try and help people digging into this, the following helped me.

      Two lenses for trying to understand this are potentially Chastain's limits on output of a lisp program being more complex than the program itself [1] or Markov's proof that you can't classify manifolds in d>= 4.

      If you try the latter and need/want to figure out how the Russian school is so different this is helpful [2]

      IMHO the former gives an intuition why, and the latter explains why IMHO.

      In ZFC, C actually ends up implying PEM, which is why using constructionism as a form of reverse math helped it click for me .

      This is because in the presence of excluded middle, every sequentially complete metric space is a complete space, and we tend to care about useful things, but for me just how huge the search space grows was hidden due to the typical (and useful) a priori assumption of PEM.

      If you have a (in my view) dislike for the constrictive approach or don't want/have to invest in learning an obscure school of it, This recent paper[3] on the limits for finding a quantum theory of everything is another lens.

      Yet another path is through Type 2 TMs and the Borel hierarchy, where while you can have a uncomputable number on the input tape you algorithms themselves cannot use them, while you can produce uncomputable numbers by randomly selecting and/or changing an infinite sequence.

      Really it is the difference between expressability and algorithms working within what you can express.

      Hopefully someone else can provide more accessible resources. I think a partial understanding of the limits of algorithms and computation will become more important in this new era.

      [1] https://arxiv.org/abs/chao-dyn/9407003 [2] https://arxiv.org/abs/1804.05495 [3] https://arxiv.org/abs/2505.11773

      4 replies →

  • No individual number is uncomputable. There’s no pair of a number and proof in ZFC that [that number] is the value of BB(748). And, so, there’s no program which ZFC proves to output the value of BB(748). There is a program that outputs BB(748) though, just like for any other number.

    • Individual numbers can be uncomputable! For example, take your favorite enumeration of Turing machines, (T1, T2...) and write down a real number in binary where the first bit is 0 if T1 halts and 1 otherwise, second bit is 0 if T2 halts... clearly this number is real and between 0 and 1, but it cannot be computed in finite time.

      3 replies →

    • I think your mistake is your claim that BB(748) is a natural number. For you to know that, you would necessarily have to know an upper bound for the number of steps it takes for the BB-748 machine (whichever machine it is) to halt. But you definitely don't know that.

      Related: It's incorrect to claim that each machine either halts or doesn't halt. To know that that dichotomy holds would require having a halting problem algorithm.

      1 reply →

  • Let X = "1 if ZF is consistent, 0 otherwise". Then the statements "X = 0" and "X = 1" are independent of ZF. Whether the definition of X is a satisfactory definition of a particular number is a question of mathematical philosophy.

    BB(748) is very similar, in that I'd call it a 'definition' independent of ZF rather than a 'number' independent of ZF.

  • The truth value of the continuum hypothesis is either 1 or 0 (at least from a platonistic perspective). But, it is proven to be independent of ZFC. No huge numbers involved, just a single bit whose value ZFC doesn't tell you.

  • Many replies don't seem to understand Godel and independence (and one that might is heavily downvoted). Cliff notes:

    * ZFC is a set of axioms. A "model" is a structure that respects the axioms.

    * By Godel, we know that ZFC proves a statement if and only if the statement is true in all models of ZFC.

    * Therefore, the statement "BB(748) is independent of ZFC" is the same as the statement "There are two different models of ZFC where BB(748) are two different numbers.

    * We can take one of these to be the "standard model"[1] that we all think of when we picture a Turing Machine. However, the other would be a strange "non-standard" model that includes finite "natural numbers" that are not in the set {0,1,2,3,...} and it includes Turing Machines that halt in "finite" time that we would not say halt at all in the standard model.

    * So BB(748) is indeed a number as far as the standard model is concerned, the problem only comes from non-standard models.

    TL;DR this is more about the fact that ZFC axioms allow weird models of Turing Machines that don't match how we think Turing Machines usually work.

    [1] https://en.wikipedia.org/wiki/Non-standard_model_of_arithmet...

    • I would edit my last line to say: weird models of numbers that don't match how we think "halts in finite steps" usually works.

  • BB(748) is a natural number, and _all_ natural numbers are computable.

    • There is definitely a function f such that f() = n for all n ∈ ℕ.

      But there is also a function g that you cannot prove whether g() = n.

      Important distinction.

      This means that somebody could claim that the value of BB(748) = n but you cannot be sure if they are correct (but you might be able to show they are wrong).

  • The category error is in thinking that BB(748) is in fact, a number. It's merely a mathematical concept.

    • No, that's one of the freakiest things about things like the Busy Beaver function. There is an exact integer that BB(748) defines. You can add one to it and then it would no longer be that number anymore.

      If you are refering to the idea that nothing that can't exist in the real universe "really exists", then the "Busy Beaver" portion of that idea is extraneous, as 100% of integers can't exist in the real universe, and therefore, 100% of integers are equally just "mathematical concepts". That one of them is identified by BB(748) isn't a particularly important aspect. But certainly, a very specific number is identified by that designation, though nothing in this universe is going to know what it is in any meaningful sense.

      11 replies →

    • There is a finite number of Turing machines of size 748. The number of them that eventually halt is thus also finite, and BB(748) is the highest number of steps in the finite list of how many steps each took to halt. It has to be a number.

      We just can't prove which number it is, we don't know which of the machines halt.

    • Let S be a statement. S is called semidecidible (also: Turing recognizable, most commonly "recursively enumerable", abbreviated as "r.e.", but I hate that one) if there is a Turing machine that halts if and only if S is true.

      With this definition, we can say that "ZFC is inconsistent" is semidecidible: you run a program that searches for a contradiction.

      The question BB(748) =/= 1000 is similarly semidecidable. You can run a program that will rule out 1000 if it is not BB(748).

      So they are in the same "category", at least regarding their undecidability.

      Also, if you turn "ZFC is consistent" into a number: {1 if ZFC is consistent; 0 if ZFC is inconsistent}, you will see, that BB(748) is not very much different, both are defined (well, equivalently) using the halting of Turing machines, or, the result of an infinite search.

      1 reply →

    • A constructive mathematician would indeed deny that BB(748) is a well defined number. One could define it as a predicate on natural numbers, but lest we find a contradiction in ZFC we cannot hope to constructively prove that it holds for any number.

    • To downvoters:

      I'm well aware that BB(748) is an integer definable in classical logic. My claim is that "integer definable in classical logic" does not actually correspond well to what people mean by "number" in almost any other setting when pushed to extremes such as this.

It's known that BB(14) is bigger than Graham's number, but this new finding leads me to believe that BB(7) is probably bigger than Graham's number. Intuitively, the technology required to go from pentation to Graham's number feels simpler than the technology required to go from `47,176,870` to `2 <pentate> 5`.

  • Thanks for sharing; your post would fit well as an answer to mine about Graham's number...

> Also, the left-superscript means tetration, or iterated exponentiation: for example, 1510 means 10 to the 10 to the 10 and so on 15 times.

I thought it was a typo. First time I encounter tetration.

  • Continuing the theme of iteration: it was the first time I encountered pentation

    • One of the reasons I like the use of the number line in schools is that on the line it's more obvious when you're shown addition and multiplication and then later exponeniation that this is a pattern. With the number line, two natural questions arise and, hopefully by the time you're taught exponentiation the Math teacher knows enough math to confidently affirm the answer to both. Yes, it keeps going like this forever, that's called Hyperoperation. And yes, we did (probably) skip one, it's known as Successor-of and you were probably not explicitly shown this operator but it's the near end of that infinite succession.

      When arithmetic is introduced just as a way to, for example, count money, it's more directly practical in the moment, but you're not seeing the larger pattern.

      2 replies →

> So I said, imagine you had 10,000,000sub10 grains of sand. Then you could … well, uh … you could fill about 10,000,000sub10 copies of the observable universe with that sand.

I don't get this part. Is it really rounding away the volume of the observable universe divided by the average volume of a grain of sand? That is many more orders of magnitude than the amount of mass in the universe, which is a more usual comparison.

  • Yes, that's right, dividing by that ratio essentially barely affects the number in a sense that 'adjacent' numbers in that notation give a much bigger change.

    10↑↑10,000,000 / (sand grains per universe) is vastly larger than, say, 10↑↑9,999,999

    So on system we're using to write these numbers, there's really no better way to write (very big)/ (only universally big) than by writing exactly that, and then in the notation for very big, it pretty much rounds to just (very big).

  • With tetration you're not dealing with orders of magnitude anymore, but orders of magnitude of orders of magnitude.

  • Here's a more common example of this sort of comparison:

    In significant figures, 1.0 billion minus 1.0 million equals 1.0 billion.

    • True but this is a ratio.

      However many universes in question, there is a qualitative difference between that many empty universes (with 1 grain), and that many completely packed with grain.

      Ask anybody who lives in one!

      2 replies →

  • Exactly. This number is so so much bigger than 10^100000 or however many grains of sand would fit, that dividing by that amount doesn’t really change it, certainly not enough to bring it down closer to 9,999,999sub10

  • Yes, that's only some normal number amount of orders of magnitude. Even 10,000,000^10,000,000 is already so large that it doesnt matter, let alone after exponentiating _the exponent_ nine times more.

    • It's the other way around: we're talking about 10^(10^(10^(10^…))) (which is vastly bigger).

So what is the richest logic whose proofs can be enumerated with only a five state TM?

  • While that question depends on what you count as an 'enumeration', there's the related question of "What's the richest logic that cannot prove the halting status of all 5-state TMs?" That is, what's the richest logic that some 5-state TM's halting status is independent of?

    I've pondered that version of the question a bit, but I couldn't get very far due to my lack of expertise in first-order logic. What I do know is that Skelet #17 [0] is one of the toughest machines to prove non-halting on a mathematical level [1], so any theory sufficient to prove that Skelet #17 doesn't halt is likely sufficient to decide the rest of the 5-state machines.

    [0] https://bbchallenge.org/1RB---_0LC1RE_0LD1LC_1RA1LB_0RB0RA

    [1] https://arxiv.org/abs/2407.02426

  • That entirely depends on how you want to interpret a finite binary string as an enumeration of logic proofs?!

> For those tuning in from home, here BB(6) is the 6th Busy Beaver number, i.e. the maximum number of steps that a 6-state Turing machine with a {0,1} alphanet can take before halting, when run on an initially all-0 input tape.

Oh! Of course! That sure clears things up for this non-expert. This is clearly a hardcore blog for people who have been doing this kind of research for decades. Kind of awesome to stumble upon something so unapologetically dense and jargony and written for a very specific audience!

  • That should be enough for someone with an undergrad CS education to at least get a sense of what's going on if they haven't encountered the busy beaver problem before.

    Is it niche jargon, absolutely, but to say it's only accessible to people who have put in decades is selling yourself short.

    • Hmm, interesting. It’s been 30 years since my engineering degree (not CS) and I’d have to look up what a Turing machine is. I think I remember one professor briefly mentioned it as “This is something the CS majors care deeply about but nobody else in the industry does.” Where I was, the CS degree was essentially a math degree dressed up in a hoodie.

      1 reply →

  • The definition there is standard undergraduate computer science theory. Maybe not standard for software engineering though.

>imagine you had 10,000,000_10 grains of sand. Then you could … well, uh … you could fill about 10,000,000_10 copies of the observable universe with that sand. I hope that helps people visualize it!

People can't visualize numbers that big. There's more ways to express numbers than just counting them. For example a single grain of sand has infinite states it can be in (there are an infinite amount of real numbers), so you could say a single grain of sand could represent BB(6). Combinations can grow exponentially, so that may be something useful to try and express it.

  • At some point big numbers become much more about the consistency strength of formal systems than “large quantities”.

    I.e., how well can a system fake being inconsistent before that fact it discovered? An inconsistent system faking consistency via BB(3) will be “found out” much quicker than a system faking consistency via BB(6). (What I mean by faking consistency is claiming that all programs that run longer than BB(n) steps for some n never halt.)

  • If the universe rounds to the nearest Planck unit, then a grain of sand suddenly has not all that many states.

    Using infinite precision to make things seem tractable is sleight of hand in my book. Stick with integers when you're describing scale.

  • I'm confused about this example, isn't the count of grains of sand equal to the count of observable universes so it'd be a single grain of sand per universe?

    • The "about" does a lot of heavy lifting in this example. Dividing 10,000,000_10 by the number of grains that fit into one universe doesn't change it much. The 10,000,000 would get smaller somewhere in the deep depths of the decimal fraction.

I wonder if the visible universe is large enough to write down the exact value of BB(6).

  • If you treat the observable universe as a closed system, you could try to apply the Bekenstein bound using - R ≈ 46.5 billion light-years (radius of the observable universe) - E ≈ total mass-energy content of the observable universe

    The mass-energy includes ordinary matter, dark matter, and dark energy. Current estimates suggest the observable universe contains roughly 10^53 kg of mass-energy equivalent.

    Plugging these into S ≤ 2πER/ℏc gives someting on the order of 10^120 bits of maximum information content.

    S ≤ 2πER/ℏc

    S ≤ (2 × 3.141593 × 3.036e+71 × 4.399e+26)/(1.055e-34 × 299792458)

    S ≤ 2.654135e+124

    S ≤ 10^120

    So, no.

  • It definitely isn't. The amount of information you can store in the universe is something like 10^120 bits. Even if I'm off by a trillion orders of magnitude it doesn't matter.

  • Just the starting number in the article is ¹⁵10. That means it's 10^(¹⁴10). That means it has ¹⁴10 digits. So no, you can't.

  • You’re probably referring to a state where all parts of the complete representation exist at the same time. Because if they don’t have to exist at the same time, then it might be possible to “write it down” if the universe has unbounded duration (“might” because I don’t know how the heat death plays into that). However, “at the same” time isn’t well-defined in relativistic spacetime. The sibling comments are definitely right with respect to the reference frame implied by the CMB. But I’m wondering if it wouldn’t be possible to slice spacetime in a way that actually makes a representation possible “at the same time” in some reference frame?

  • It's not.

    • I want some easier to comprehend number for BB(6), in decimal notation. But it's such a massive number I would need to invent a new notation to express that. I love this new (to me) concept of tetration number representation. 10-million sub 10, what is the number?

      Look at 3 sub 10 = which is (10^(10^10)). So that is 10 to the power of 10 billion. In regular decimal notation, that is a "1" with 10 billion "0"s following it. It takes 10 gigabytes of ram to represent the number in decimal notation, naively.

      The number of atoms in the universe is only 10^80, or 1,000...000 (80 zeroes). 10-million sub 10 is so huge, how much ram to represent it.

      This example is from https://www.statisticshowto.com/tetration-function-simple-de...

Any time I see such results from computation complexity theory, I realize that any current zeitgeist of "super-intelligent AI are gods" is complete bullshit.

You can convert every atom of observable Universe into a substrate for supercomputer, you can harness energies of supermassive black holes to power it, but running a humble BB(6) to halting state would be forever out of its reach.

If you want to learn about actual Busy Beaver results, I suggest reading https://www.sligocki.com/ instead

Unlike Aaronson, he actually is on the forefront of Busy Beaver research, and is one of the people behind the https://bbchallenge.org website

  • >Unlike Aaronson, he actually is on the forefront of Busy Beaver research [...]

    Extremely bad ad hominem, I enjoyed Aaronson's read, nothing wrong with it.

    • Gently, seconding peer: that is not ad hominem :)

      Colloquially, I understand it's easy to think it means "saying something about someone that could be interpreted negatively" because that's the context it is read in it when it is used.

      The meaning is saying a logical argument is incorrect because of who wrote the argument.

      4 replies →

  • Maybe Scott isn't at the forefront of the research by some standards, but I still consider him a prominent figure in the field. Independence of ZFC, Busy Baver Frontier paper, "Who Can Name the Bigger Number?" essay. He did a lot to popularise the topic and posed some interesting ideas or conjectures (Beeping Busy Beavers for example).

  • https://www.sligocki.com/ hasn't posted since April, and the very first link on that blog is a link to... Scott Aaronson.

    • Could I bother you for some more info?

      I spent 5 minutes trying to verify any link in the post above links to Scott Aaronson, or mentions him, and found nothing. :\ (both the siglocki, and when I found nothing there, the busy beaver site)

      1 reply →