Comment by sanjayjc

4 months ago

> I think the aesthetic preference for terseness should give way to the preference for LLM accuracy, which may mean more verbose code

From what I understand, the terseness of array languages (Q builds on K) serves a practical purpose: all the code is visible at once, without the reader having to scroll or jump around. When reviewing an LLM's output, this is a quality I'd appreciate.

Perl and line noise also share these properties. Don’t particularly want to read straight binary zip files in a hex editor, though.

Human language has roughly, say, 36% encoding redundancy on purpose. (Or by Darwinian selection so ruthless we might as well call it "purpose".)

  • > Human language has roughly, say, 36% encoding redundancy on purpose.

    The purpose is being understandable by a person of average intellect and no specialized training. Compare with redundancy in math notation, for example.

    • > The purpose is being understandable by a person of average intellect and no specialized training.

      The purpose probably is keeping human speech understandable through the often noise-filled channel of ambient sound. Human speech with no redundancy would have a hard time fighting the noise floor.

I agree with you, though in the q world people tend to take it to the extreme, packing a whole function into a single line rather than a single screen. Here's the standard tickerplant script from KX themselves: https://github.com/KxSystems/kdb-tick/blob/master/tick.q I personally find that this density makes it harder to read; when reading it I paste it into my text editor and split the semicolon-separated statements onto separate lines. E.g. one challenge I set myself was generating a magic square on a single line. For odd sizes only, I wrote: ms:{{[(m;r;c);i]((.[m;(r;c);:;i],:),$[m[s:(r-1)mod n;d:(c+1) mod n:#:[m]];((r+1)mod n;c);(s;d)])}/[((x;x)#0;0;x div 2);1+!:[x*x]]0}; / but I don't think that's helping anyone

  • There's a difference between one line and short/terse/elegant.

      {m:(x,x)#til x*x; r:til[x]-x div 2; 2(flip r rotate')/m} 
    

    generates magic squares of odd size, and the method is much clearer. This isn't even golfed as the variables have been left.
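
    For readers who don't read q, here is a rough Python transliteration of the one-liner above. It is a sketch, not part of the original comment: the helper names (`rotate`, `step`, `magic_square`, `is_magic`) are made up for illustration, and it assumes q's `rotate` shifts left for positive counts. As in the q version, the entries run from 0 to n*n-1 rather than from 1 to n*n.

```python
def rotate(row, k):
    """Rotate a list left by k positions (negative k rotates right)."""
    k %= len(row)
    return row[k:] + row[:k]


def step(m, r):
    """Rotate row i of m by r[i], then transpose (q: flip r rotate' m)."""
    rotated = [rotate(row, k) for row, k in zip(m, r)]
    return [list(col) for col in zip(*rotated)]


def magic_square(n):
    """Build an n-by-n square of 0..n*n-1 for odd n by applying step twice."""
    m = [[n * i + j for j in range(n)] for i in range(n)]  # (x,x)#til x*x
    r = [i - n // 2 for i in range(n)]                     # til[x]-x div 2
    for _ in range(2):                                     # 2 f/ m
        m = step(m, r)
    return m


def is_magic(m):
    """Check that m holds 0..n*n-1 once each and all lines share one sum."""
    n = len(m)
    target = n * (n * n - 1) // 2
    lines = [list(row) for row in m] + [list(col) for col in zip(*m)]
    lines.append([m[i][i] for i in range(n)])          # main diagonal
    lines.append([m[i][n - 1 - i] for i in range(n)])  # anti-diagonal
    distinct = sorted(v for row in m for v in row) == list(range(n * n))
    return distinct and all(sum(line) == target for line in lines)
```

    For n = 3 this produces the rows 7 0 5, 2 4 6 and 3 8 1, where every row, column and diagonal sums to 12.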

    • Do you know which method this is? Euler's, by any chance? And do you have an idea how one would prove that it creates a magic square? It's actually one of my inspirations for writing this: the relationship between the code that does something and the proof that the code actually does what it claims. I'd argue an LLM would find the proof helpful if it were asked to generalize an existing function in some way.
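
      One possible route to a proof, offered as a sketch rather than as the commenter's own argument: unrolling the two rotate-then-flip steps by hand (assuming q's `rotate` shifts left for positive counts and the entries are 0 to n*n-1) appears to give a closed form for the entry in row i, column j. The symbols D and h below are introduced here for illustration.

```latex
% n odd, h = (n-1)/2, indices i, j in {0, ..., n-1}
\[
  D_{ij} = n\,\bigl((i + j - h) \bmod n\bigr) + \bigl((i + 2j + 1) \bmod n\bigr)
\]
% The map (i, j) -> (i + j - h, i + 2j + 1) mod n has determinant
% 1*2 - 1*1 = 1, so it is a bijection and every value 0..n^2-1 occurs once.
% Required line sum:
\[
  n \cdot \frac{n(n-1)}{2} + \frac{n(n-1)}{2} = \frac{n(n^2 - 1)}{2}
\]
```

      Along any row or column, both residues run through all of 0..n-1 because 1 and 2 are invertible mod an odd n, which gives the sum above. On the anti-diagonal the first residue is constantly h and the second runs through all values, which gives the same total. On the main diagonal the first residue runs through all values, and the second does too unless 3 divides n, in which case it takes each of 1, 4, ..., n-2 three times, which still sums to n(n-1)/2.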

  • When Q folks try to write C: https://github.com/kparc/ksimple

    • Representative example:

        //!malloc
        f(a,y(x+2,WS+=x;c*s=malloc(y);*s++=0;*s++=x;s))     //!< (a)llocate x bytes of memory for a vector of length x plus two extra bytes for preamble, set refcount to 0
                                                            //!< and vector length to x in the preamble, and return pointer to the 0'th element of a new vector \see a.h type system
        f(_a,WS-=nx;free(sx-2);0)                           //!< release memory allocated for vector x.
        G(m,(u)memcpy((c*)x,(c*)y,f))                       //!< (m)ove: x and y are pointers to source and destination, f is number of bytes to be copied from x to y.
                                                            //!< \note memcpy(3) assumes that x/y don't overlap in ram, which in k/simple they can't, but \see memmove(3)
        //!memory management
        f(r_,ax?x:(++rx,x))                                 //!< increment refcount: if x is an atom, return x. if x is a vector, increment its refcount and return x.
        f(_r,ax?x                                           //!< decrement refcount: if x is an atom, return x.
               :rx?(--rx,x)                                 //!<   if x is a vector and its refcount is greater than 0, decrement it and return x.
                  :_a(x))                                   //!<   if refcount is 0, release memory occupied by x and return 0.
      

      Reminds me a bit of both the IOCCC and 70s Unix C from before anyone knew how to write C in a comprehensible way. But the above is ostensibly production code and the file was last updated six months ago.

      Is there some kind of brain surgery you have to undergo when you accept the q license that damages the part of the brain that perceives beauty?

  • I've been dabbling in programming language design lately. While trying to decide whether including feature 'X' makes sense, with readability as the main focus, I rediscovered some old wisdom:

    1 line should do 1 thing - that's something C has established, and I realized that putting conceptually different things on the same line destroys readability very quickly.

    For example if you write some code to check if the character is in a rectangular area, and then turn on a light when yes, you can put the bounds check expressions on the same line, and most people will be able to read the code quickly - but if you also put the resulting action there, your code readability will suffer massively - just try it with some code.

    That's why ternary expressions like a = condition ? expr1 : expr2 are kinda controversial. They're not always bad, as they can encode logic about a single thing: "if said character is friendly, the light color should be green, otherwise red" is a good example. Doing error handling there is not.

    I haven't been able to find any research that backs this up (didn't try very hard tho), but I strongly believe this to be true.

    A nice thing is that some other principles, like command-query separation (CQS), can be derived from this. For example, CQS dictates that a function like checkCharacterInAreaThenSetLightState() is bad and should be split up into checkCharacterInArea() and setLightState().
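
    The rectangle-and-light example above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the `Rect` type, the dict-based light, and the snake_case function names are hypothetical stand-ins for the checkCharacterInArea() and setLightState() mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    left: float
    top: float
    right: float
    bottom: float


def check_character_in_area(x: float, y: float, area: Rect) -> bool:
    """Query only: all four bounds checks fit on one line, nothing is changed."""
    return area.left <= x <= area.right and area.top <= y <= area.bottom


def set_light_state(light: dict, on: bool, friendly: bool) -> None:
    """Command only: mutate the light, return nothing."""
    light["on"] = on
    # A ternary that encodes a single decision about a single thing.
    light["color"] = "green" if friendly else "red"


def update_light(x: float, y: float, area: Rect, light: dict, friendly: bool) -> None:
    """Caller keeps the check and the resulting action on separate lines."""
    inside = check_character_in_area(x, y, area)
    set_light_state(light, inside, friendly)
```

    The point of the split is that the bounds check can be read, tested and reused without also reading what happens to the light.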

    • I'd perhaps generalize that to "it's useful to have visual grouping correlate with semantic grouping"; applies to separating words with spaces, sentences with punctuation, paragraphs with newlines, lines of code, UIs, and much more.

      An important question for this also is what qualifies for "single thing"; you can look at a "for (int i = 0; i < n; i++) sum += arr[i]" as like 5 or more things (separate parts), 2 things (loop, body), or just one thing ("sum"). What array languages enable is squeezing quite a bit into the space of "single thing" (though I personally don't go as far as the popular k examples of ultra-terseness).

  • Hey, another language with smileys! Like Haskell, which has (x :) (an operator section, i.e. partial application of a binary operator)