Designing a Language (2017)

18 hours ago (cs.lmu.edu)

I would like to see Raku (https://raku.org) at least mentioned in the list of languages to be aware of. Why?

  - Raku has built in Grammars so it is a great place to do early iteration of your parser
  - Raku is objects and type classes all the way down (as explained here https://gist.github.com/raiph/849a4a9d8875542fb86df2b2eda89296 )
  - RakuAST development is well advanced (use v6.e.PREVIEW) with the Slangify module to accelerate development of sub languages (Slangs)

Here is a Raku implementation of Brainfuck to whet the appetite https://github.com/alabamenhu/PolyglotBrainfuck/blob/main/li...

I think you should probably start by asking yourself if you should design a new language. Most new languages fall in the bucket of low-value innovation that is instant tech debt for anyone who tries to use them for real.

Even the successful ones are often pointless variations on a theme. Ruby, Perl & Python don't all need to exist, for example, as they essentially do the same thing, about as poorly. Now that Python has won, we should just drop the others.

  • You've assumed there's only one reason for designing a language and based your opinion around that, which makes it shallow and not terribly convincing.

  • Terrible advice.

    Different languages excel at different things. There shouldn’t be a “one size fits all” otherwise we’d be writing software in FORTRAN and assembly.

    And designing a language is a good exercise, if purely from an academic perspective. E.g., you learn how to write parsers, and a bunch of PL theory that we take for granted when just being a consumer of a programming language.

    Not everything needs to be done with global domination in mind.

    • FYI:

      I started programming assembly in 2025 for 6502 and Z80 CPUs, and believe me: it is fun and IMO actually easier than, let's say, learning Haskell or JS from scratch.

      Assemblers with macros are amazingly simple.

    • Building one for fun or to learn is great.

      The bad thing is the uncanny valley. Popular enough to fragment the niche and add tech debt, not big enough to win and defragment the niche, not innovative enough to make any real positive difference beyond personal tastes.

    • You missed his point. He's not saying "why bother with a new language if this one is fine", he's saying "why bother with a very similar language if this one is fine".

      I think that's fair. Even if you are just doing a hobby language there are plenty of unexplored niches, e.g. that compile-to-shell language I've forgotten the name of.

  • I learned a lot from deciding to do so. I doubt there is a situation where one should not consider doing it.

  • You should, just so you'll know how compilers and languages work. It doesn't have to be good.

I'm curious if there are any books or blogs that detail the design decisions, or the lack of them, for some popular languages, from the perspective of language design and industry usage.

I could and have written a few toy interpreters, but I have no academic or industrial background (on the matter of language design), so it is useful to know why they put some features into a language, and why they don't. It is actually one of the most confusing parts of writing an interpreter for a toy language -- in all of my projects I simply pick a subset of an existing language I know about, e.g. Python or C.

  • For Python, you can read the PEP documents - Python Enhancement Proposals - and see the discussion of what was suggested, pros and cons, work done to determine the preferred implementation, and the final decision.

    https://peps.python.org/pep-0468/

    • It's pretty far back on my blog to-do list, but the key order guarantee that was solidified in 3.7 lost the opportunity for a further space optimization (for dicts where keys are frequently removed).

  • The Design and Evolution of C++ by Stroustrup is a fascinating book. It only covers the early years of C++, but that's perhaps what you're most interested in.

I've been playing around with interpreted variants of brainfuck for genetic programming experiments. The intended audience of the language is the evolutionary algorithm, not a human. The central goals are to minimize the size of the search space while still providing enough expressivity to avoid the Turing tarpit scenario (i.e., where we need an infeasible # of cycles to calculate a result).

I've recently found that moving from a linear memory model to a stack-based model creates a dramatic improvement in performance. The program tape is still linear, but the memory is a stack interface. It seems the search space is made prohibitively large by using pointer-based memory access. Stack based makes it a lot easier to stick arbitrary segments of programs together and have meaningful outcomes. Crossover of linear program tapes does not seem practical without constraining the memories in some way like this.
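To make the crossover point concrete, here is a minimal sketch of the general idea, not the language described above. It assumes a hypothetical loop-free instruction set (`1`, `d`, `s`, `+`, `-`, `*`) where every opcode is a no-op when the stack cannot satisfy it, so any slice of any program is still a valid program.

```python
import random

def run(program, max_stack=64):
    """Execute a linear tape of one-character opcodes against a stack.

    Hypothetical opcodes: '1' push 1, 'd' duplicate top, 's' swap top two,
    '+', '-', '*' pop two values and push the result.
    """
    stack = []
    for op in program:
        if op == "1":
            if len(stack) < max_stack:
                stack.append(1)
        elif op == "d":
            if stack and len(stack) < max_stack:
                stack.append(stack[-1])
        elif op == "s":
            if len(stack) >= 2:
                stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op in ("+", "-", "*"):
            # Underflow is treated as a no-op instead of an error.
            if len(stack) >= 2:
                b = stack.pop()
                a = stack.pop()
                if op == "+":
                    stack.append(a + b)
                elif op == "-":
                    stack.append(a - b)
                else:
                    stack.append(a * b)
        # Unknown characters are ignored, so every string is a valid program.
    return stack

def crossover(parent_a, parent_b, rng):
    """One-point crossover: a head of one parent joined to a tail of the other."""
    cut_a = rng.randint(0, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    return parent_a[:cut_a] + parent_b[cut_b:]
```

Because no opcode addresses memory by position, a child such as `crossover("11+d*", "1d+d*s-", random.Random(0))` always runs to completion. Whether its result is useful is left to the fitness function.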

  • Hey! Have you come across the recent(ish) paper from Google researchers about self-replicators? In one of their experiments they used a self-modifying (metaprogrammable) variant of BrainFuck that I've found very interesting for EAs. I haven't fully replicated their findings as I've been experimenting with better ways to observe the evolution progress, but perhaps it might be interesting for your work as well.

In a similar vein is this 2003 post in an MIT discussion forum by Scott McKay [1].

I'd also highly recommend that anyone interested in this kind of thing listen to all three of the Dynamic Languages Wizards Series panels from 2001: runtime [2], language design [3], and compilation [4]

Note that though these are videos, there isn't much that is compelling in the visual portion; you could easily rip them to audio files and lose little.

[1] https://libarynth.org/fifty_questions_for_a_prospective_lang...

[2] https://www.youtube.com/watch?v=4LG-RtcSYUQ

[3] https://www.youtube.com/watch?v=agw-wlHGi0E

[4] https://www.youtube.com/watch?v=at7viw2KXak

> Java and C#, for being enterprisey

I believe there is far more interesting stuff to learn about these languages. The whole category of runtimes could have been mentioned, since it can directly affect the language design itself (e.g. having GC vs. some language feature for managing memory, an open vs. closed world model, having an async feature in the language vs. letting the runtime handle it, etc.).

That's a nice summary of the space and how large it is. My recommendation is to just start with a math expression parser and evaluator. You can start with a Pratt parser, but I would even recommend converting infix to reverse Polish notation using a stack.

Adding a construct like IF or variables is the natural next step, and by then you will have code in place and an idea of where to put it and how to approach it.
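For anyone who wants to try the infix-to-RPN route, here is a minimal sketch using the shunting-yard algorithm. It assumes only the four left-associative binary operators and parentheses. The function names are illustrative, and unary minus and error recovery are left out.

```python
import operator

# Precedence and implementation of each binary operator (all left-associative).
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}

def tokenize(text):
    """Split an expression into float literals, operators and parentheses."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit() or ch == ".":
            j = i
            while j < len(text) and (text[j].isdigit() or text[j] == "."):
                j += 1
            tokens.append(float(text[i:j]))
            i = j
        elif ch in OPS or ch in "()":
            tokens.append(ch)
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens

def to_rpn(tokens):
    """Shunting-yard: reorder infix tokens into reverse Polish notation."""
    output, stack = [], []
    for tok in tokens:
        if isinstance(tok, float):
            output.append(tok)
        elif tok in OPS:
            # Pop operators that bind at least as tightly (left associativity).
            while stack and stack[-1] in OPS and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        else:  # tok == ")"
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
    while stack:
        op = stack.pop()
        if op == "(":
            raise ValueError("unbalanced parentheses")
        output.append(op)
    return output

def eval_rpn(rpn):
    """Evaluate an RPN token list with a value stack."""
    stack = []
    for tok in rpn:
        if isinstance(tok, float):
            stack.append(tok)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok][1](left, right))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

def evaluate(text):
    return eval_rpn(to_rpn(tokenize(text)))
```

Variables would slot into `tokenize` and `eval_rpn` as a new token type looked up in a dictionary, which is roughly the "you will have code in place" effect described above.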

I learned a lot about JVM runtime, how Zig is parsing itself, how Lua represents values... Too many good rabbit holes to fall in.

I’m waiting for an LLM-focused language. We’re already seeing that AI does better with strongly typed languages. If we treat an agent’s ability to ensure correctness, as instructed by a human, as the priority, things could get interesting. The question is, will humans actually be able to make sense of it? Do we need to?

I started making a language, and I took many shortcuts.

I just parse my language, translate it to C, and use C compiler errors.

I don't add new semantics, I just add many things like strings, map, etc to make it usable and fast.

I don't know if it's a good idea and how difficult this will be.
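As a rough illustration of the translate-to-C approach, here is a minimal sketch. It assumes a hypothetical toy language with only `name = expr` and `print(expr)` statements over integers, and it borrows Python's `ast` module as a stand-in parser. Undefined variables are deliberately not checked, so the C compiler is the one that reports them.

```python
import ast

# Operators the toy language supports, mapped to their C spelling.
C_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}

def emit_expr(node):
    """Translate one expression node into C source text."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return str(node.value)
    if isinstance(node, ast.Name):
        # Undefined names are not checked here; the C compiler reports them.
        return node.id
    if isinstance(node, ast.BinOp) and type(node.op) in C_OPS:
        left, right = emit_expr(node.left), emit_expr(node.right)
        return f"({left} {C_OPS[type(node.op)]} {right})"
    raise SyntaxError(f"unsupported expression: {type(node).__name__}")

def transpile(source):
    """Translate `name = expr` and `print(expr)` lines into a C program."""
    declared = set()
    body = []
    for stmt in ast.parse(source).body:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            name = stmt.targets[0].id
            # Declare the variable as `long` on first assignment only.
            prefix = "" if name in declared else "long "
            declared.add(name)
            body.append(f"    {prefix}{name} = {emit_expr(stmt.value)};")
        elif (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)
                and stmt.value.func.id == "print"
                and len(stmt.value.args) == 1 and not stmt.value.keywords):
            arg = emit_expr(stmt.value.args[0])
            body.append(f'    printf("%ld\\n", (long)({arg}));')
        else:
            raise SyntaxError(f"unsupported statement on line {stmt.lineno}")
    lines = ["#include <stdio.h>", "", "int main(void) {", *body,
             "    return 0;", "}"]
    return "\n".join(lines) + "\n"
```

Calling `transpile("x = 2 + 3 * 4\nprint(x)")` returns C source text that can be handed to any C compiler. Everything beyond parsing, including most diagnostics, is delegated to that compiler, which is the trade-off described above.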

  • You're in good company. The original C++ compiler translated to C. Haskell translates to C--.

    That sounds like a good approach, concentrate on the things you want to do/learn and let the C compiler pick up the rest. You can then be finished, add to your front end, or start replacing the backend.

    My language compiles to Javascript. I wanted to concentrate on the frontend tasks like type checking and elaboration, and I wanted a web playground (the language is now self-hosted). Javascript got me a runtime with closures and garbage collection for free.

  • Chicken Scheme compiles to C, using a method that ended up as a maths paper (Cheney on the MTA).

    It's a valid approach.

    • One day I aspire to be able to fully comprehend Cheney on the MTA. I kinda get it? But I've never learned C, and never had to slog through manual memory management, so it's a little lost on me

  • I am also doing this. Like you, I guess, I want a nicer C. I produce my own errors, though, because it's better for the user.

  • "I don't know if it's a good idea and how difficult this will be."

    It is a great idea, if you want to learn about languages!

    (But if money is your goal, you may want to reconsider)

  • Nothing wrong with that - some others do it too. You can even use TCC to do quick test builds and only use Clang/GCC for release builds.

I’m pretty sure that most language designs skip the “formally” part of the cycle suggested in TFA

And that’s probably a good thing.

Reading the headline my first thought was another kind of language : the linguistic language (English, Spanish, French, Esperanto etc.)

How does one create a new spoken/written language ?

Interesting page. The latest language I designed is a stack-based intermediate language for a C compiler. It is not really intended for human use, but it is readable in the sense that you can compare it with the original C code.

  • Ha! I just wrote a stack-based RISC CPU architecture with an assembler, and I am now thinking about implementing a compiler for my own FORTH-like language (a niche stack-based programming language).

    Fun

    • Great. Are you going to open source it? If so, let me know. You can find my email address on the website mentioned in my profile.

Nice summary, but in my experience with programming language design the macro usage issues loom large. What about base libraries, use of popular libraries, build tools, performance analysis, debugging, packaging and modularity, and so on? The core design matters and then cascades into all manner of differences.

What do these languages compile to? What's the build pipeline and runtime context?

  • What do you mean by runtime context? In any case, language design and implementation are distinct concepts, although they usually run in parallel so you don't end up with a design that is unimplementable (e.g. BitC).

My understanding after reading many of such posts is the following:

1) You are NOT serious (in effort to be invested, resources, knowledge), then don't do it.

2) You are MEH serious, then probably design some DSL in Lua or similar, which will serve your case 99% of the time.

3) You ARE serious, then go for it. Chances are that you might even post it here one day, but also almost no one will ever use it apart from some crazy fans.

  • > 1) You are NOT serious (in effort to be invested, resources, knowledge), then don't do it.

    I did it while being non-serious. I got about half of a language working, and I don't regret it. It was fun. I got a little bored and distracted by other things, so I stopped working on it.

    Such posts are great, because they let you pick some new ideas that will be fun to code.

    > You ARE serious, then go for it.

    I don't think it works this way. To become serious you need some really good idea. But to get a really good idea you need to do at least a couple of full loops through the four phases the article begins with. Until you have invested a lot of time into writing languages, you are highly unlikely to get a really good idea for a new language.

  • There is no harm in building a compiler and designing a language as a hobby. It is gratifying to build something and see it work, and it is often interesting to hear about other people’s projects.

    The problem comes when designers have delusions of grandeur about their language/compiler. There are lots of people like this on programming language forums who drive themselves nuts because they don’t realize that languages become popular due to platform exclusivity/marketing or due to word of mouth around a readily available implementation that offers something unique. Most hobby languages/compilers are not that different from existing ones so this rarely happens. And the people who create languages are rarely good at building communities because they usually lack social skills (and they tend to be a little manic/defensive about their creations).

I had some thoughts about designing a new language. However it's a huge undertaking and I don't know the answers to some questions:

1. Is there a need for the programming language?

2. If the answer to the previous question is yes, can I find enough people to help and enough resources?

3. If the answer to the previous question is yes, can we release an MVP in a reasonable amount of time?

4. If the answer to the previous question is yes, what is the chance it will gather a reasonable amount of users?

There are literally tons of programming languages that didn't make it. I wouldn't want to waste my own and other people's resources.

  • I made a language for use in another project, so I'll answer your questions.

    https://www.npmjs.com/package/wang-lang

    - This new language looks and behaves exactly like JavaScript, except it doesn't have "eval" and "new Function", so it is CSP safe. That's the only difference. I wanted to execute dynamically generated code in a Chrome extension.

    - An LLM did most of the work of creating a nearley grammar and the associated interpreter (the whole thing is bundled; nearley is not a final dependency). Elaborate tests make this quite sane to handle.

    - It took me about 1 week in total for the initial MVP to try out, and since then I have been fixing bugs and inconsistencies with JavaScript behavior, at about 1 day a month of effort.

    - mostly 0

    The only reason to create it was that I couldn't find anything similar, and it was low effort thanks to the LLM.

    I also created another even smaller DSL you can say

    https://www.npmjs.com/package/free-text-json-parser

    It parses json embedded in plain text
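For readers curious how JSON can be pulled out of surrounding prose, here is a minimal sketch of one way to do it. It is not the implementation of the package linked above; it simply scans for `{` or `[` and lets the standard library's `json.JSONDecoder.raw_decode` try to decode from that position. The function name `extract_json` is made up for this example.

```python
import json

def extract_json(text):
    """Return every JSON object or array found inside free text, in order."""
    decoder = json.JSONDecoder()
    found = []
    i = 0
    while i < len(text):
        if text[i] in "{[":
            try:
                # raw_decode parses one value starting at index i and
                # returns it together with the index just past its end.
                value, end = decoder.raw_decode(text, i)
            except json.JSONDecodeError:
                i += 1  # not valid JSON here; keep scanning
                continue
            found.append(value)
            i = end
        else:
            i += 1
    return found
```

This sketch only looks for objects and arrays, so bare numbers or strings in the prose are ignored, which is usually what you want for free text.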

    • I once made a hacked version of JavaScript for work, starting with Rhino. I adjusted it to make `.` and `[]` on null/undefined return undefined. Kind of like the `?.` in modern JavaScript, but that didn't exist back then. I was inspired by Objective-C's message send behavior.

      The language was for some configuration in a reporting system. The scripts were written by non-engineers, and the changes made the language more user friendly for them. I started from javascript because I expected it would be easier for them to find documentation.

    • Nice. I built something basically just like this for work for the same reason last year. It only took a few hours though, because I just used Acorn [0] to parse my JS, then directly evaluated the AST. It also had an iteration limit and other configurable limits so I can eval stuff in the browser without crashing the tab. I did not use an LLM.

      [0]: https://github.com/acornjs/acorn
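The parse-then-walk approach with a step limit can be sketched as follows. This is an analogous example in Python, not the JavaScript/Acorn code described above: Python's `ast` module stands in for the parser, and the supported subset (integer constants, names, `+ - *`, single comparisons, assignment, `while`) and all names are assumptions chosen to keep it short.

```python
import ast
import operator

BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
CMP_OPS = {ast.Lt: operator.lt, ast.Gt: operator.gt, ast.Eq: operator.eq}

class StepLimitExceeded(RuntimeError):
    """Raised when a script uses up its step budget."""

class Interpreter:
    def __init__(self, max_steps=10_000):
        self.env = {}
        self.steps_left = max_steps

    def tick(self):
        # Every visited node costs one step, so runaway loops are cut off.
        self.steps_left -= 1
        if self.steps_left < 0:
            raise StepLimitExceeded("step budget exhausted")

    def evaluate(self, node):
        self.tick()
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.Name):
            return self.env[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in BIN_OPS:
            left = self.evaluate(node.left)
            right = self.evaluate(node.right)
            return BIN_OPS[type(node.op)](left, right)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in CMP_OPS):
            left = self.evaluate(node.left)
            right = self.evaluate(node.comparators[0])
            return CMP_OPS[type(node.ops[0])](left, right)
        raise NotImplementedError(type(node).__name__)

    def execute(self, statements):
        for stmt in statements:
            self.tick()
            if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                    and isinstance(stmt.targets[0], ast.Name)):
                self.env[stmt.targets[0].id] = self.evaluate(stmt.value)
            elif isinstance(stmt, ast.While) and not stmt.orelse:
                while self.evaluate(stmt.test):
                    self.execute(stmt.body)
            else:
                raise NotImplementedError(type(stmt).__name__)

def run(source, max_steps=10_000):
    """Parse `source`, walk its AST, and return the final variable bindings."""
    interpreter = Interpreter(max_steps)
    interpreter.execute(ast.parse(source).body)
    return interpreter.env
```

Since nothing outside the whitelisted node types is ever executed, the host's `eval` is never involved. The budget counts visited nodes only, so a real version would likely also cap value sizes and memory.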

  • I think most popular languages were started as an experiment in some feature, or to solve a specific problem someone had. Those are good reasons to make a language. I see no reason to make a language just to take attention away from other existing languages. Instead, make a language so you can understand how to make languages. It is 100% doable by one person. It's fun and educational.

  • Sometimes it might just be a fun project to push yourself. Maybe such a complex undertaking can't be fun indeed lol

    • My idea of fun is to release something people will use. I have more fun if I work on something useful. For me it is less about the journey than the end goal.

      I love working on software, architecture, design but only if I see some use.

      Of course, for other people, the journey is more interesting than the destination and they have fun hacking stuff just for the sake of it. They discover things and learn new stuff they wouldn't have learned otherwise. And this is a path at least as valid as the other.

  • I'll try to answer your questions best I can.

    1. Yes, as long as there are new machines that need programming, new programming languages will be needed. Today's top languages were built for the machines of the 1970s, 80s, and 90s. Tomorrow's languages will be built for machines of today and tomorrow. As Alan Kay put it, if you want to invent a new language, first invent the machine of the future and then build a language for it.

    2. No, you cannot. First of all, PL devs are cats, it's very difficult collecting them without financial compensation. So if your plan is to post a language and hope that people will come help you, you'll likely be disappointed. The problem is that everyone else interested in building PLs has their own itch to scratch, and they're not going to scratch yours without some compensation.

    You might think "Well I can just raise money to do this", and you would be wrong. First, it's very hard to raise money for PLs. Usually you have to have some sort of cred to do it. I know of only 3 projects to have raised VC money for a PL project, and they each had some success before they had done so: Chris Granger (Light Table), Paul Biggar (CircleCI), and Chris Lattner (Swift/LLVM). Granger's project Eve raised $2M and ran out of money after 3 years; Biggar's project Dark also raised money, then fired all the devs when he realized he was burning cash too fast, then he slow-burned development for years, then he gave up and handed development over to someone else; and Lattner raised almost $100M for Mojo, which is probably going to end much the same way as Eve and Dark, but I wish them the best.

    Anyway, the point is that you personally (no offense) don't have the profile to raise $100M like Lattner. $2M is not enough for a PL project. Lattner is keeping Mojo closed source for now because there's no good answer for how they're going to make enough money as an open source language to justify raising $100M.

    And the reason it's so hard to raise money is because there's no money to be made. No one pays for PLs. No one pays for PL dev tools. They have to be open source or they're rejected by the dev community. The only ones these days who can reasonably pay for all of this with no potential revenue stream are giant corporations, who use the lang as a hook into their ecosystem.

    3. Even though the answer is no, you yourself can still get an MVP off the ground in a pretty reasonable amount of time. It's never been easier to make a PL. The problem with PLs is building them is kind of like measuring the coastline; language projects are fractals -- there's an infinite amount of detail you can work on in any given direction. It's very easy for a language project to become a language + editor project, and it's easy for that to turn into language + editor + operating system if you're not disciplined. Plenty of PL devs have fallen into that trap.

    4. Rounds to 0% chance. You'll be lucky if you build something that even you will use. Rather, most PL devs end up working on their language in some other language, because working on languages is what they want to do!

    That said, it's still important to write languages that you understand no one will use. First, it allows you to try new things that may be good but unpopular. If PL devs only did what was popular with devs, PLs would go nowhere as a field.

    Consider the so called "Hornet's nest" of programming languages [1], which is the tightly related cluster of imperative programming languages which have been the most researched and used over the last 50 years. There is a vaaaaaaaast design space outside that nest, begging for more language development. No one will use most of them, but it's important to understand what those languages might look like to maybe find some new ideas that work.

    Also "didn't make it" is kind of an unfair judgement. Gaining popularity doesn't have to be a goal. In fact, it shouldn't be a goal if you want to have any fun at all. There's an infinite amount of work to be done, and if you're not doing it for you, you won't get far at all. That's really the only way to fail at this.

    Good luck!

    [1] https://tomasp.net/techdims/#footer=index,navigation;left=ca...

    • Bold plus, making PLs is a lifestyle, not a business. Most PLs clone each other and absorb features. The only difference is QOL and tooling. Users expect to have a full set of batteries, an IDE/LSP, jobs, OOP style, and minimal effort to learn. Being popular conflicts with the idea of pushing the boundaries and shifting paradigms.
