Comment by tombert
10 hours ago
I remember I got a little confused when I was first learning TLA+, because what you normally call "functions" are "operators" [1], and what you'd normally call "maps" or "lists" are called "functions".
It was odd to me because it hadn't really occurred to me before that, given infinite memory (and within a mathematical framework), there isn't necessarily any fundamental difference between a "list" and a "function". With pure functions, you could, in "theoretical-land", replace any "function" with an array, where the "argument" becomes an index.
And it makes sense to me now; TLA+ functions can be defined like maps or lists, but they can also be defined in terms of operations that compute the values in the function. For example, you could define the first five factorials like this:
    Fact == <<1, 2, 6, 24, 120>>
or like this:
    Fact[n \in Nat] ==
        IF n = 0 THEN 1
        ELSE n * Fact[n - 1]
In both cases, if you wanted to get the factorial of 5, you'd write Fact[5], and that's on purpose: ultimately, from TLA+'s perspective, the two are equivalent.
[1] At least sort of; they superficially look like functions but they're closer to hygienic macros.
I remember having a similar sort of realization early in my career when trying to implement some horribly convoluted business logic in SQL (I no longer remember the actual details of what I was trying to do, just the epiphany that resulted; I think it had something to do with proration of insurance premiums and commissions). I realized that if I simply pre-computed the value of the function in question and shoved it into a table (requiring "only" a couple million rows or so), I could use a join in place of function application and be done with it. An obvious idea in retrospect, but the sudden dredging of the set-theoretic formulation of functions (that they are simply collections of tuples) from deep within my memory was certainly startling at the time.
BTW this is extremely common in life insurance systems, where premiums (provisions, surrender values, etc.) depend on formulas applied to mortality tables. Those data are themselves simply tables indexed by age from 0 to 100, so many formulas end up with only 100 possible outputs (or 200 for combined or gender-specific probabilities) and are simply precomputed.
I've seen this as a "solution" to implementing a function for Fibonacci numbers. The array of all the Fibonacci numbers that fit into a 32-bit integer is not particularly large, so sticking it into a static local variable is easy to do.
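Something like this minimal C sketch (the names and error convention are mine; the bound of 47 entries is because fib(47) is the first Fibonacci number that overflows a signed 32-bit integer):

    #include <stdint.h>

    /* fib(0)..fib(46) are the only Fibonacci numbers that fit in a
       signed 32-bit integer, so the whole "function" is a 188-byte
       static table, filled in on the first call. */
    int32_t fib(int n) {
        static int32_t table[47];
        static int filled = 0;
        if (!filled) {
            table[1] = 1;  /* table[0] is already zero-initialized */
            for (int i = 2; i < 47; i++)
                table[i] = table[i - 1] + table[i - 2];
            filled = 1;
        }
        return (n >= 0 && n < 47) ? table[n] : -1;  /* -1: out of range */
    }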
> It was odd to me because it hadn't really occurred to me before that, given infinite memory (and within a mathematical framework), there isn't necessarily any fundamental difference between a "list" and a "function".
You don't even need infinite memory. If your function is over a limited domain like bool or u8 or an enum, very limited memory is enough; see the sketch below.
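As a rough illustration (hypothetical names, with C's uint8_t standing in for u8): any pure function over a byte collapses into a 256-entry table, here counting set bits (popcount):

    #include <stdint.h>

    /* Tabulating a pure function over the 256 possible byte values.
       Call init_popcount_table() once at startup. */
    static uint8_t popcount_table[256];

    void init_popcount_table(void) {
        for (int i = 0; i < 256; i++) {
            uint8_t bits = 0;
            for (int b = i; b != 0; b >>= 1)
                bits += (uint8_t)(b & 1);
            popcount_table[i] = bits;
        }
    }

    uint8_t popcount8(uint8_t x) {
        return popcount_table[x];  /* function application becomes indexing */
    }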
However, the big difference (in most languages) is that a function call can take arbitrarily long, whereas an array access either succeeds or fails quickly.
> However, the big difference (in most languages) is that a function call can take arbitrarily long, whereas an array access either succeeds or fails quickly.
For some definition of quick. Modern CPUs are usually bottlenecked by memory bandwidth and cache size, so a function that recomputes the value can often be quicker than a lookup table, at least outside of microbenchmarks (in a microbenchmark the table doesn't have to compete with the rest of the code base for cache).
Of course this depends heavily on how expensive the function is, but it is worth keeping in mind that memory is not necessarily quicker than computing again. If you need to go to main memory, you have something on the order of a hundred ns in which you could be recomputing the value instead, which at 2 GHz translates to 200 clock cycles. What that means in terms of number of instructions depends on the instructions, how many execution units you can saturate, whether the CPU can predict branches and prefetch the memory, etc. But RAM is really slow. Even with a cache hit you are talking single-digit ns to tens of ns (depending on whether it's L1, L2, or L3).
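A hypothetical C illustration of that trade-off: two interchangeable implementations of the same pure function (bit-reversing a byte), where which one wins depends on whether the table stays resident in cache:

    #include <stdint.h>

    static uint8_t rev_table[256];  /* 256-byte LUT; fast only while cached */

    void init_rev_table(void) {
        for (int i = 0; i < 256; i++) {
            uint8_t r = 0;
            for (int b = 0; b < 8; b++)
                r = (uint8_t)((r << 1) | ((i >> b) & 1));
            rev_table[i] = r;
        }
    }

    uint8_t reverse_lut(uint8_t x) {
        return rev_table[x];  /* one load; ~100 ns if it misses to RAM */
    }

    uint8_t reverse_compute(uint8_t x) {
        /* a handful of ALU ops and no memory traffic: can beat the
           LUT under cache pressure, per the numbers above */
        x = (uint8_t)(((x & 0xF0) >> 4) | ((x & 0x0F) << 4));
        x = (uint8_t)(((x & 0xCC) >> 2) | ((x & 0x33) << 2));
        x = (uint8_t)(((x & 0xAA) >> 1) | ((x & 0x55) << 1));
        return x;
    }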
I've been watching those Kaze Emanuar videos on his N64 development, and it's always so weird to me when "doing the expensive computation again" is cheaper than "using the precomputed value". I'm not disputing it; he seems to have done a lot of research and testing confirming the results, and I have no reason to think he's lying, but it's so utterly counter-intuitive to me.
It's a classic space / time trade-off. The special relativity of programming, if you like.
One case where a function is often not substitutable for an array is equality testing. In a language where any two arrays with the same elements in the same order are equal ([1,2] == [1,2]), the same cannot always be true of two equivalent functions, because extensional equality is undecidable for arbitrary functions.
Arrays and functions may be mathematically equivalent, but at the programming-language level they are practically different.
Is this what the fp community calls referential transparency?
Very similar; referential transparency is the ability to replace a function call (or, more generally, an expression) with its result (or value). Using an array (or other tabling mechanism), you can take advantage of this to trade space for time.
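A small hypothetical C sketch of that trade, caching results on demand (the names and the caching policy are invented for the example):

    #include <stdint.h>

    /* Assumed pure and expensive; the body is just a placeholder. */
    static uint32_t f(uint8_t x) {
        uint32_t acc = 1;
        for (int i = 0; i < x; i++)
            acc = acc * 31 + (uint32_t)i;  /* stand-in for real work */
        return acc;
    }

    static uint32_t cache[256];
    static uint8_t  seen[256];  /* 0 = not computed yet */

    /* Safe only because f is referentially transparent: the cached
       value is indistinguishable from a fresh call. */
    uint32_t f_memo(uint8_t x) {
        if (!seen[x]) {
            cache[x] = f(x);
            seen[x] = 1;
        }
        return cache[x];
    }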
Reminds me of many years ago when people were fascinated by the discussion about whether closures are objects or objects are closures. Yes... Yes they are.