Comment by 20k

1 day ago

What we really we need is some kind of more detailed spec language that doesn't have edge cases, where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement. It'd be super helpful to have something more formal with no ambiguity, especially because the english language tends to be pretty ambiguous in general which can result in spec problems

I also tend to find especially that there's a lot of cruft in human written spec languages - which makes them overly verbose once you really get into the details of how all of this works, so you could chop a lot of that out with a good spec language

I nominate that we call this completely novel, evolving discipline: 'programming'

20 comments

20k

dnmc 1 day ago

There are languages like Dafny that permit you to declare pre- and post-conditions for functions. Dafny in particular tries to automatically verify or disprove these claims with an SMT solver. It would be neat if LLMs could read a human-written contract and iterate on the implementation until it's provably correct. I imagine you'd have much higher confidence in the results using this technique, but I doubt that available models are trained appropriately for this use case.

throw1234567891 1 day ago

Ask it to do so, show it how, and it will do it.

IsTom 1 day ago

> What we really we need is some kind of more detailed spec language that doesn't have edge cases, where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement.

That's theorem provers and they're awful for anything of any reasonable complexity.

pyrale 20 hours ago

That's any programming language, really [1]. Any website contains millions of "proofs", not all of them are useful. Choosing what needs to be proven is hard. And the spectrum of languages/type systems and their usability as either is more explored nowadays than it used to be. If you don't likue coq, you can look for agda. If agda is too far for you, you can look for Haskell. If that's still impractical, there's rust or f#, etc... The tradeoff you have between "convenient for expressing proofs" and "convenient for programming" has many options.
[1]: https://www.youtube.com/watch?v=IOiZatlZtGU

tedd4u 14 hours ago

Check out Allium, posted here recently.

https://juxt.github.io/allium/

giancarlostoro 1 day ago

Hate it all you want, but XML is genuinely a good fit there, and Claude is apparently insanely good at working with XML prompts.

geraneum 1 day ago
I don’t know why, but I get this feeling whenever someone uses “insanely” or “shockingly” along with AI, I think they’re bot or are writing based on a guideline! No offense, btw, I’m not saying you’re a bot.
- rspeele 16 hours ago
  
  I'm prepared to excise the word "genuinely" from my vocabulary after working with Claude.
  One of my biggest fears with using AI at work is that I will subconsciously start talking and writing like a bot, despite making conscious efforts to do the opposite. Just like how when you read a lot of books by one author, their style infects your own writing style.
- giancarlostoro 21 hours ago
  
  You’re absolutely right!
  Kidding, nah no worries. I do worry people become overly paranoid of bots as time passes.
  
  1 reply →
SpicyLemonZest 1 day ago
Is Claude good at working with XML prompts, or is XML good at convincing users to write more Claude-able specs? I am intensely skeptical that you could write an XML document describing a nontrivial web application in full detail, but I could easily imagine someone who thinks they have to stripping out important details because they don't really map to XML.
- giancarlostoro 21 hours ago
  
  They train it with XML even the system prompts that Claude reads are formatted by it.
  
  1 reply →

ramoz 1 day ago

yes, am familiar with the "code is spec" trope.

Shame us all for moving away from something so perfect, precise, and that "doesn't have edge cases."

Hey - if you invent a programming language that can be used in such a way and create guaranteed deterministic behavior based on expressed desires as simple as natural language - ill pay a $200/m subscription for it.

20k 1 day ago
As people are discovering, natural language is insufficiently precise to be able to specify edge cases. Any language precise enough to be formally verified against is a programming language
- Npovview 11 hours ago
  
  One agent generates : Spec -> Code then
  Another agent: Code -> Inverted Spec
  then compare Spec and Inverted Spec.
  If there is a Gap, a Human fixes and clarifies the Gap.
  This is like Generator and Discriminator aspects of GAN models or Autoencoder models.
- ramoz 1 day ago
  
  we're going to end up speaking past each other - but generally I do agree with you and am not denouncing the importance of formal verification methods. I do think abstractions are going to dominate the human ux above them
giancarlostoro 1 day ago

XML will do it very well.

zombot 1 day ago

> where we describe exactly what we expect the generated code to do, and then formally verify that the now generated code matches the input spec requirement.

In ancient times we had tech to do exactly that: Programming languages and tests.

k3vinw 1 day ago

like declarative vs imperative?