The largest number representable in 64 bits

3 years ago (googology.fandom.com)

Too many ads. Ads that are large. Ads that are videos. I am privileged enough to have a connection that can serve up the content with all of those ads. But I'm not privileged with the patience to read anything on that site.

If you don't want to host your own blog, consider putting it on github.

Even a single bit may be arbitrarily large, depending on the encoding. Encoding may be defined as a language, Turing machine, bb, etc. I can also define a single set bit as 10↑↑100 and beat your busy beaver. It's all arbitrary. A float is also an encoding, and with different mantissa size I may get a larger number.

Now if we count a number of distinct states 64 bits can take, it will still be 18446744073709551615, which I consider the only possible answer.

  • As the article points out though, there is nothing arbitrary about lambda calculus:

    > lack of cheating toward wanted results

TFA neglected to mention that floats have an "infinity" value, quite a bit larger than any 64 bit busy-beaver.

(And before anyone says it's not a number, call `isnan` with an infinity and get back to me :)

It all depends on what values you're willing to leave unrepresentable.

(Bearing in mind that the set of operations that you can accurately support in your number system also depends on the set of representable values.)

If you don't really care you can go arbitrarily high.

  • Very true, but in all fairness the proposed solution of using Turing machines is pretty reasonable. I think the title is a play on Scott Aaronson's "largest number" game.

    • Sure, but where do you stop?

      I think a more interesting question is "what's the largest number usefully representable in 64 bits?".

      1 reply →

  • Yeah, just define a 1-bit type of the values {0, MAX}, for some very large number MAX. And now we‘ve represented a very large number with a 1-bit value!

    • It's trivial in the extreme cases, but similar logic was used to implement the floating points - it compromised on the accuracy of results, but gave us a number system that's useful across a range of scales.

      If we could find some set of staggeringly large values, perhaps even infinite, with some useful set of operations that could be performed on them (and mapping back to the set) then we can come up with ridiculous answers that aren't necessarily useless.

It's interesting, trying to figure out the conceptual basis for the greater power-per-bit of BLC compared to classical TMs. I wonder if it simply has to do with how β-reduction can pack so many more copy-and-pastes into a byte than TM states can, especially relative to the normal form that we use to define BB_λ. At small sizes, classical TMs are forced to do weird Collatz-like tricks, whereas BLC gets nearly-instant exponentiation. Do you have any thoughts on this?

(Also, I've been thinking of how that BB_λ conjecture might be proven. One strategy I'm thinking of for sufficiently large n would be to create a compression scheme for TMs which omits many copies of machines and trivial machines that cannot contribute to BB_TM, to get past the naive 2n(2+log₂(n+1))-bit bound. Then, we create a BLC program with a decompressor and a TM interpreter, to which a compressed TM can be appended. But the compression would have to be good enough to get around the overhead of bit sequences not being natively expressible in BLM.)

  • Binary Lambda Calculus seems to spend its bits much more wisely than a Turing Machine encoding, working at a much higher level of abstraction. Defining and applying functions performs more useful work than moving a tape head around and jumping from state to state. The Turing Machine is also severely hampered by the pure sequential nature of the tape, where lambda calculus has access to all kinds of easily defined data structures.

    > One strategy I'm thinking of for sufficiently large n would be to create a compression scheme for TMs

    The universal variant https://oeis.org/A361211 of the lambda busy beaver can easily simulate any binary encoded TM with fixed overhead.

    • > Binary Lambda Calculus seems to spend its bits much more wisely than a Turing Machine encoding, working at a much higher level of abstraction. Defining and applying functions performs more useful work than moving a tape head around and jumping from state to state.

      That's what I mean by β-reduction being a more powerful operation: it can copy a term into arbitrary points in the program. (In the BB context, I like to think of BLC in terms of substitution rather than functions.) So I wonder if the comparison is somewhat biased, since applying a TM transition is logically simpler than applying a BTC β-reduction, which involves recursively parsing the current state, substituting, and reindexing.

      > The Turing Machine is also severely hampered by the pure sequential nature of the tape, where lambda calculus has access to all kinds of easily defined data structures.

      I'd say TMs have plenty of data structures, but most of the useful ones are weird and alien since the the control flow has to be encoded on the tape alongside the data, vs. BLC which can precisely direct where a term should be substituted. The real hamper IMO is the locality of the tape: a TM can't move a block of data across another block of data on the tape without wasting valuable states.

      > The universal variant https://oeis.org/A361211 of the lambda busy beaver can easily simulate any binary encoded TM with fixed overhead.

      Of course; the trick is to go from n + c bits to n bits.

  • One thing that comes to mind is function application, which begets recursion. This is the last quarter of the example program's encoding, while the example TM has enough space for 6 states and no concept of a "goto" or "jmp" that would simulate recursion.

    • On the contrary, 36 of its 60 bits are dedicated to goto labels. Although you're right that TMs cannot "call" a state then "return" to the caller, as would be expected from a reusable function. Regardless, for busy beaver purposes, most of TMs' state tends to be encoded in the tape rather than the state graph, so I suspect it's the differences in the means of data manipulation that create such a gap. (E.g., with substitution, you can easily move a term across an arbitrary amount of garbage.)

The intro got me thinking of MDL model selection. I.e. to express X you can choose a language L that can represent X, and rather than focusing on the conciseness of just L(X) (which for some powerful L might be a single bit) it's more fair to also take the length of the language itself into account.

Then this question would be rephrased as something along the lines of "what language would fit into 64 bits and leave enough enough bits to describe a huge value in that language? And which would represent the largest value?"

https://en.wikipedia.org/wiki/Minimum_description_length

  • That just begs the question: in what language do you describe the language L? In terms of features, the language I use was, together with combinatory logic, the first language ever proposed for formalising computation back in the 1930s, so it's about as non-arbitrary as can be...

I like how deceptively complicated this problem is. It rapidly goes from “oh this is obvious” to “oh I see it now” to “I need multiple cross disciplinary PhDs to understand the answer and it’s proof.”

One large number that fits in 64 bits is:

  F^^^^^^F = 15 ^ .... ^ 15
  12345678
  [8*8=64 bit]

According to Conway and Guy (1996) The Book of Numbers, p. 60, the arrow notation, defined by Knuth in (1976), is such that

  m^n is  m x m x ...x m,
  m^^n is m^m^ ...^m,
  m^^^n is m^^m^^ ... ^^m,

and so on, with n copies of m in each case, and the expression being evaluated from the right.

  • That is however inconceivably smaller than the number in the article, which exceeds Graham's number.

You only need a single bit to represent a number and say that number is infinity. But if you want to waste another 63 bits and be a bit more portable:

  std::numeric_limits<double>::infinity()

But then inf, e.g. calculated by 1/0 would be the largest number

  • You didn’t read the article.

    It asks for the largest terminating number.

    • Probably better than you. There is no such thing as a terminating number, and the article mentions "the shortest terminating program to produce the largest number whose output size exceeds Graham's number". Which should be an expression better than ack(9,9). But since you can emit just inf, you don't need a turing machine/lambda calculus at all.

      1 reply →