I've been working on my own web app DSL, with most of the typing done by Claude Code, e.g.,

  GET /hello/:world
    |> jq: `{ world: .params.world }`
    |> handlebars: `<p>hello, {{world}}</p>`

  describe "hello, world"
    it "calls the route"
      when calling GET /hello/world
      then status is 200
      and output equals `<p>hello, world</p>`
Here's a WIP article about the DSL:
https://williamcotton.com/articles/introducing-web-pipe
And the DSL itself (written in Rust):
https://github.com/williamcotton/webpipe
And an LSP for the language:
https://github.com/williamcotton/webpipe-lsp
And of course my blog is built on top of Web Pipe:
https://github.com/williamcotton/williamcotton.com/blob/mast...
It is absolutely amazing that a solo developer (with a demanding job, kids, etc.) with just some spare hours here and there can write all of this with the help of these tools.
Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home-automation system) and automatically renames all the rooms and devices to English.
It worked on the first run. Well, the second, because the first run was by default a dry run that printed a beautiful table; the actual run requires a CLI arg, and it also makes a backup.
It was a complete solution.
I've gotten Claude Code to port Ruby 3.4.7 to Cosmopolitan: https://github.com/jart/cosmopolitan
I kid you not. It took between a week and ten days and cost about €10. After that I became a firm convert.
I'm still getting my head around how incredible that is. I tell friends and family and they're like "ok, so?"
It seems like AIs work the way non-programmers already thought computers worked.
I am incredibly curious how you did that. You just told it "port Ruby to Cosmopolitan" and let it crank away for a week? Or what did you do?
I'll use these tools, and at times they give good results. But I would not trust them to work that much on a problem by themselves.
I've been surprised by how often Sonnet 4.5 writes working code the first try.
I've found it to depend on the phase of the moon.
It goes from genius to idiot and back in the blink of an eye.
Working, configurable via command-line arguments, nice to use, well-modularized code.
Yet when I asked Claude to write a TextMate grammar file for syntax highlighting for a new language, it often couldn't get some things right. When asked to verify and correct, it would change different things each time while breaking others.
In Swift and Godot/GDScript, it also tended to give inefficient solutions or use outdated or nonexistent APIs.
Try this: even when the output is correct, tell it something like "That's not correct, verify and make sure it's valid." Does it change things randomly and devolve into using imagined APIs?
I think coding-by-AI is still only good for things that you already know about, to just reduce typing time for boilerplate etc.; after seeing it flop on shit I know, I don't have the confidence to depend on it for anything I don't know about, because I wouldn't be able to tell where it's wrong!
No doubt these products will get better; the companies making them aren't just going to sit down and give up. But we're far from being able to blindly trust anything produced by AI, and I'm not sure we ever should.
It's a fun post, and I love language experiments with LLMs (I'm close to hitting the weekly limit of my Claude Max subscription because I have a near-constantly running session working on my Ruby compiler; Claude can fix -- albeit sometimes with messy code -- issues that require complex tracing of backtraces with gdb, and can fix complex parser interactions almost entirely unaided, as long as it has a test suite to run).
But here's the Ruby version of one of the scripts:

  BEGIN {
    result = [1, 2, 3, 4, 5]
      .filter { |x| x % 2 == 0 }
      .map { |x| x * x }
      .reduce { |acc, x| acc + x }
    puts "Result: #{result}"
  }
The point being that running a script with the "-n" switch runs the BEGIN/END blocks and puts an implicit "while gets ... end" around the rest. Adding "-a" auto-splits the line like Awk. Adding "-p" also prints $_ at the end of each iteration.
So here's a more typical Awk-like experience:

  ruby -pe '$_.upcase!' somefile.txt    # $_ holds the whole line

Or:

  ruby -F, -ane 'puts $F[1]'    # prints the second field

-F sets the default character to split on, and -a adds an implicit $F = $_.split.
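And since BEGIN/END blocks still run under -n, the classic Awk field-summing idiom carries over too. A sketch (numbers.txt is a hypothetical whitespace-separated input file):

  ruby -ane 'BEGIN { $sum = 0 }; $sum += $F[0].to_i; END { puts $sum }' numbers.txt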
That is not to detract from what he's doing, because it's fun. But if your goal is just a better Awk, then Ruby is usually a better Awk, and so, for that matter, is Perl. For most things where an Awk script doesn't fit on the command line, the only real reason to use Awk is that it is more likely to be available.
So I have had to work very hard to use $80 worth of my $250 in free Claude Code credits. What am I doing wrong?
They have been able to write languages for two years now.
I think I was the first to write an LLM language, and the first to use LLMs to write a language, with this project (right at ChatGPT's launch, on GPT-3.5): https://github.com/nbardy/SynesthesiaLisp
A related test I did around the beginning of the year: I came up with a simple stack-oriented language and asked an LLM to solve a simple problem (calculate the squared distance between two points whose coordinates are already on the stack), having it figure out the details.

The part I found neat was that I used a local LLM (some quantized version of QwQ from around December or so, I think) that had a thinking mode, so I was able to follow the thought process. Since it was running locally (and it wasn't a MoE model), it was slow enough for me to follow in real time, and I found it fun watching the LLM try to understand the language.

One other interesting part: the language description had a mistake, but the LLM managed to figure things out anyway.
Here is the transcript, including a simple C interpreter for the language and a test for it at the end with the code the LLM produced:
https://app.filen.io/#/d/28cb8e0d-627a-405f-b836-489e4682822...
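For flavor, here's a minimal sketch of that kind of exercise in Ruby. The instruction set (rot/sub/dup/mul/add) and the run helper are made up for illustration; the actual language and code are in the transcript above.

  # Hypothetical Forth-style stack machine; op names are invented here.
  def run(program, stack)
    program.split.each do |op|
      case op
      when 'dup' then stack.push(stack.last)
      when 'rot' then stack.push(stack.slice!(-3))  # (a b c -- b c a)
      when 'add' then b, a = stack.pop, stack.pop; stack.push(a + b)
      when 'sub' then b, a = stack.pop, stack.pop; stack.push(a - b)
      when 'mul' then b, a = stack.pop, stack.pop; stack.push(a * b)
      else stack.push(Integer(op))                  # numeric literal
      end
    end
    stack
  end

  # Stack holds x1 y1 x2 y2; compute (x1 - x2)**2 + (y1 - y2)**2.
  p run('rot sub dup mul rot rot sub dup mul add', [1, 2, 4, 6]) # => [25]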
THANK YOU for SHARING YOUR WORK!!
So many commenters claim to have done things w/ AI, but don't share the prompts. Cool experiment, cooler that you shared it properly.
> I only interacted with the agent by telling it to implement a thing and write tests for it, and I only really reviewed the tests.
Did you also review the code that runs the tests?
Yes :)
I Can't Stand LLMs And Their Supporters
The money shot: https://github.com/Janiczek/fawk
A purely interpretive implementation of the kind you'd write in school; still, above and beyond anything I'd have any right to complain about.
Commendable effort, but I expected at least a demo showcasing working code (even if it's hacky). It's like someone talking about sheet music without playing it once.
See https://github.com/Janiczek/fawk and the .fawk files in https://github.com/Janiczek/fawk/tree/main/tests.
Even more, it's like talking about sheet music without seeing the sheet itself.
I did AoC 2021 up to day 10 using Awk; it was fun but not easy, and I couldn't proceed further: https://github.com/nusretipek/Advent-of-Code-2021
I've been trying to get LLMs to make Racket "hashlangs"† for years now, both for simple almost-lisps and for honest-to-god different languages, like C. It's definitely possible; raco has packages‡ for C, Python, J, Lua, etc.
Anyway, so far I haven't been able to get a nice result from any of the obvious models; hopefully they're finally smart enough.
† https://williamjbowman.com/tmp/how-to-hashlang/
‡ https://pkgd.racket-lang.org/pkgn/search?tags=language
A few months ago I used ChatGPT to rewrite a Bison-based parser to recursive descent and was pretty surprised how well it held up, though I still needed to keep prompting the AI to fix things or add elements it skipped, and in the end I probably rewrote 20% of it because I wasn't happy with its strange use of C++ features, which made certain parts hard to follow.
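For anyone curious what the target shape looks like, here's a toy sketch in Ruby (not the commenter's actual C++ parser or grammar): one method per nonterminal, with operator precedence encoded by which method calls which.

  # Toy recursive-descent evaluator for arithmetic expressions.
  class ExprParser
    def initialize(src)
      @tokens = src.scan(%r{\d+|[-+*/()]})  # crude tokenizer
    end

    def parse
      value = expr
      raise "trailing input: #{@tokens.first}" unless @tokens.empty?
      value
    end

    private

    # expr := term (('+' | '-') term)*
    def expr
      value = term
      while %w[+ -].include?(@tokens.first)
        value = @tokens.shift == '+' ? value + term : value - term
      end
      value
    end

    # term := factor (('*' | '/') factor)*
    def term
      value = factor
      while %w[* /].include?(@tokens.first)
        value = @tokens.shift == '*' ? value * factor : value / factor
      end
      value
    end

    # factor := number | '(' expr ')'
    def factor
      if @tokens.first == '('
        @tokens.shift
        value = expr
        raise 'expected )' unless @tokens.shift == ')'
        value
      else
        Integer(@tokens.shift)
      end
    end
  end

  ExprParser.new('2 * (3 + 4)').parse # => 14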
> And it did it.
It would be nice if people who do these things gave us a transcript or recording of their dialogue with the LLM, so that more people can learn.
Yes! This. It'd take so little effort to share, thereby validating your credibility, providing value, teaching... It's so full of win that I can't understand why so few people do this.
I wrote two:

jslike (Acorn-based parser):
https://github.com/artpar/jslike
https://www.npmjs.com/package/jslike

wang-lang (I couldn't get ASI to work like JavaScript's in this Nearley-based grammar):
https://www.npmjs.com/package/wang-lang
https://artpar.github.io/wang/playground.html
https://github.com/artpar/wang
Yes! I'm currently using Copilot + Antigravity to implement a language with ergonomic syntax and semantics that lowers cleanly to machine code targeting multiple platforms, with a focus on safety, determinism, auditability, and fail-fast bugs. It's more work than I thought, but the LLMs are very capable.

I was dreaming of compiling JS to machine code, but then thought: why not just start from scratch and have what I want? It's a lot of fun.
Curious why you did this with AI instead of just writing it yourself? You should be able to whip up a lexer, parser, and compiler in a couple of weeks.
I'm not the previous user, but I imagine that weeks of investment might be a commitment one does not have.
I have implemented an interpreter for a very basic stack-based language (you can imagine it being one of the simplest interpreters you can have) and it took me a lot of time and effort to have something solid and functional.
Thus I can absolutely relate to the idea of having an LLM that has seen many interpreters lay the groundwork for you and let you play with your ideas as quickly as possible, putting off delving into the details until necessary.
Because he did it in a day, not a few weeks.
If I want to go from Bristol to Swindon, I could walk there in about 12 hours. It's totally possible to do it on foot. Or I could use a car and be there in an hour: there and back, with a full work day in between, done in a day. Using the tool doesn't change what you can do; it speeds up getting the end result.
What's the point of making something like this if you don't get to deeply understand what you're doing?
How deep do you need to know?
"Imagination is more important than knowledge."
At least for me that fits. I have quite enough graduate-level knowledge of physics, math, and computer science to rarely be stumped by a research paper or anything an LLM spits out. That may get me scorn from those tested on those subjects. Yet, I'm still an effective ignoramus.
I have made a lot of things using LLMs and I fully understood everything. It is doable.
What's the point of owning a car if you don't build it by hand yourself?
Anyway, all it will do is stop you from being able to run as well as you used to when you had to go everywhere on foot.
So you are using a tool to help you write code, because you don't enjoy coding, in order to make a tool used for coding (a programming language). Why?
There are lots of different things people can find interesting. Some people love the typing of loops. Some people love the design of the architecture, etc. That's like asking, "How can you enjoy woodworking if you use a CNC machine to automate parts of it?"
I take satisfaction in the end product: something I have created myself, with my own skills and learning. If I haven't created it myself and yet still have an end product, what have I accomplished?

It's nice for a robot to produce it for you, but you've really gained nothing other than a product that's unknown to you.
Coding has many aspects: conceptual understanding of the problem domain, design, decomposition, etc., and then typing code and debugging. Can you imagine that a person might enjoy the conceptual part more and skip over some of the typing exercises?
The whole blog post does not mention the word "grammar". As presented, it is example-based: the LLM spat out its plagiarized code and beat it into shape until the examples passed.
We do not know whether the implied grammar is conflict free. We don't know anything.
It certainly does not look like enjoying the conceptual part.
For the same reason we have Advent of Code: for fun!
I mean, he's not solving the puzzles with AI. He's creating his own toy language to solve the puzzles in.
This place has just become pro-AI propaganda. Populism is coming for AI, from both MAGA and the left.
https://www.bloomberg.com/news/articles/2025-11-19/how-the-p...
If it's just propaganda, it will fall of its own accord. If it's not, there's no stopping it.