Comment by dwohnitmok

3 months ago

> First, you can see that time_f only gives you upper bounds for the runtime (they say this in the book explicitly, but in a different context), because who is to say that you cannot find even "faster" equations.

I don't understand the relevance. I'm not asking for exact computation of steps. Big O is fine. And provisional on the current implementation would be the whole point.

> If Isabelle had time_f built into its kernel from the start (the script just runs invisibly to you inside the kernel), would you then say that there is a formal connection in Isabelle between f and its runtime complexity?

No. That would be a clear hack. And it would break a lot of stuff in Isabelle. For example, you would introduce a distinction between first-order and higher-order functions. E.g. how do you calculate the runtime of a function like `map` when you don't have a concrete function to do the substitution on? This kind of thing only works when you have the concrete syntax of a function, which is different from the entire rest of the way the kernel works. (BTW the way the authors of this particular book deal with `map` is they just hard-coded it by hand).

Another illustration of this problem is that you wouldn't be able to state or prove very natural theorems such as "the asymptotic runtime of a sort function based on pairwise comparison can't be less than O(n log(n))".

Just jamming it in the kernel mixes syntactic and semantic concerns together. Depending on how awkward it's done it could affect the soundness of the underlying logic.

E.g. you could just say "oh we'll just make a logical relation `T` that is filled in by the compiler as necessary so e.g. in `map` it'll just have a placeholder that can then get put in" but then what is `T` in the kernel? What's its type? Can you abstract over `T`? What's its runtime representation? Does it even have a runtime representation? What are its deductive rules? So and so forth. And the answers to all these questions are all linked together.

There's a reason why `time_f` approaches haven't been adopted. It's very easy to blow up your logical system accidentally. It's a subtle problem and why I was asking for production examples. There's been some efforts here such as RAML, but it's a thing where you need to find the right balance between a full-blown macro system and a pure denotational deductive system.

> Third, if you wanted to do it not axiomatically (which is what this is: it is an axiomatic approach to runtime complexity), but definitionally, then you have to come up with a lot of definitions

The problem isn't axiomatic vs definitional, the problem is denotational vs operational. Once you decide one way or the other then the rest comes after that. Right now you can't even write a morphism between an Isabelle function and a machine function, let alone prove it.

8 comments

dwohnitmok

auggierose 3 months ago

> I don't understand the relevance. I'm not asking for exact computation of steps. Big O is fine. And provisional on the current implementation would be the whole point.

What do you mean by "the current implementation"? Isabelle defines mathematical functions, not running code.

> No. That would be a clear hack. And it would break a lot of stuff in Isabelle.

You are thinking a little bit too simplistic here. Of course you have to do some work to integrate it properly. In particular, you would need to have to generate inequalities (upper bounds) as runtime theorems. Just as code generation (for example to Ocaml) is integrated with Isabelle, this approach could be integrated similarly, and I don't think there would be any danger to the logic. Note that for code generation, you also have to provide special theorems for higher-order functions. But it would definitely be interesting to work this out in detail.

> wouldn't be able to state or prove very natural theorems such as "the asymptotic runtime of a sort function based on pairwise comparison can't be less than O(n log(n))"

No, you wouldn't, because the approach only proves upper bounds. As stated before. So lower bounds on runtime complexity would not be natural at all in this framework.

> The problem isn't axiomatic vs definitional, the problem is denotational vs operational.

You have to explain what you mean by that. Of course you can write these morphisms, you need to define your machines and machine functions first though, which is a lot of work. Maybe start with just a Turing machine first. The upside of all this work: you could also prove theorems about lower bounds now.

dwohnitmok 3 months ago
> What do you mean by "the current implementation"? Isabelle defines mathematical functions, not running code.
This is the whole problem! The line gets blurry with `primrec`, `fun`, and `function` and indeed the book is treating Isabelle definitions as runnable code (and a lot of uses of Isabelle with code extraction do the same).
But precisely because Isabelle technically doesn't define running code it's really hard to make cost functions and relate them to the original Isabelle function.
E.g. here's another definition of `double`.
definition double :: "nat => nat" where "double n = (THE y. y + n = 3 * n)"
Perfectly reasonable in Isabelle. How do you generate the cost function for it? A reasonable answer is "well you don't, you only generate them for certain kinds of definitions, just like other lemmas in Isabelle," which then leads me to...
> Of course you have to do some work to integrate it properly. In particular, you would need to have to generate inequalities (upper bounds) as runtime theorems. Just as code generation (for example to Ocaml) is integrated with Isabelle, this approach could be integrated similarly, and I don't think there would be any danger to the logic.
Okay so how would you do this? Again the point here is to have some formal connection between the runtime cost function and the original function. This boils down to one of two ways in Isabelle (which technically are both the same, but they have meaningful ergonomic differenes). One is as a relation so e.g. a `T` such that
T_rel(f, time_f)
or as a function such that
T_fun(f) = time_f
If you introduce these, then in the former case, how do you prove `not(T_rel(f, time_f))`? In the latter case, what is `T_fun(T_fun)`?
> Of course you can write these morphisms, you need to define your machines and machine functions first though, which is a lot of work.
I meant writing them in Isabelle.
- auggierose 3 months ago
  
  > I meant writing them in Isabelle.
  That would also be Isabelle, but with an explicitly defined machine model as parameter of the functions of interest, which would also be phrased in terms of the machine parameter. But yes, let's ignore this possibility, it would not be pretty, for sure.
  > Okay so how would you do this?
  Ok, let's try to define some sort of built-in function RUNTIME.
  You cannot just define RUNTIME for arbitrary expressions, and proceed with the approach in the book, as for example RUNTIME(nil) = 0 would lead to a non-sensical RUNTIME(f x1 ... xn) = 0 whenever f x1 ... xn evaluates to nil, i.e. f x1 ... xn = nil.
  You would need to modify the type system so that applications of RUNTIME are properly typed, you cannot just give it type RUNTIME : ('a => 'b) => 'a => nat, as then RUNTIME : ('a => 'b => 'c) => 'a => nat would follow, but it should act in this case as if it had type RUNTIME : ('a => 'b => 'c) => 'a => 'b => 'nat (which it cannot actually have for the same reason).
  You also cannot define RUNTIME simply as described in the book via generating equations, as different sets of input equations (like for your double example) might lead to different runtime costs, and therefore inconsistency in the logic.
  So you can only generate theorems involving RUNTIME for certain cases, and the theorems you generate must be inequalities, not equalities.
  You can do that by implementing an oracle (or kernel function) that takes in a user-defined function f and an equation for f, and returns an inequality as a theorem for RUNTIME f ..., as otherwise explained in the book, using auxiliary functions ℰ and T. Because ℰ and T are extra-logical, you cannot return the inequality if it still contains ℰ or T after simplification.
  You will want to distinguish between user-defined functions, primitive functions for which you have built-in / configured upper bounds for ℰ and T, and functions where you cannot say (expressions involving such functions will not simplify enough, and thus block theorem generation).
  For example for the append example on page 8 of the (free draft of the) book, you would get inequalities
  RUNTIME append [] ys ≤ 1 RUNTIME append (x#xs) ys ≤ RUNTIME append xs ys + 1
  You will be able to prove what these generated inequalities allow you to prove, which are upper bounds on runtime complexity. Note that for practical purposes, these upper bounds only hold under the assumption that you actually select the best (or at least good enough) equations for evaluating the function, i.e. those equations which you used for deriving the upper bounds via the oracle. RUNTIME itself is obviously one of the functions for which you cannot say (you don't have equations for RUNTIME from which to derive anything via the oracle, just inequalities), so you should not be able to get any meaningful RUNTIME statements about RUNTIME itself.
  I am not sure if you can derive inconsistencies from this by considering "non-terminating" functions, but if so, then it should be possible to deal with this by making the range of RUNTIME not just ℕ, but ℕ ∪ {∞}, or Option nat.
  The next step would be to prove that this approach actually works, or to come up with examples of inconsistencies induced by it.
  Hope this helps!
  
  5 replies →