
Comment by dnautics

8 days ago

in the era of LLMs, syntax might matter more than you think.

The C form `type name;` is ambiguous because it could actually be more than one thing depending on context, and even worse if you include macro shenanigans. The alternative (~Rust/Zig) form, `var/const/mut name type`, is unambiguous.

For humans, with a rather long memory of what is going on in the codebase, this is ~"not a problem" for experts. But for an LLM, whose knowledge is limited to the content that currently exists in your context plus the conventions baked into the training corpus, this matters. Of course it is ALSO a problem for humans who are looking at a codebase for the first time, especially if the types are unusual.

I hope that someday LLMs will interact with code mostly via language servers, rather than reading the code itself (which, as you've noted, frequently confuses the LLM, and is also simply a waste of tokens).

  • why? I suspect that writing code itself is extremely token efficient (unless your keywords happen to be silly, super-long alien text).

    Like which do you think is more token-efficient?

    1)

         <tool-call write_code "my_function(my_variable)"/>
    

    2)

        <tool-call available_functions/>
    
        resp: 
             <option> my_function </option>
             <option> your_function </option>
             <option> some_other_function </option>
             <option> kernel_function1 </option>
             <option> kernel_function2 </option>
             <option> imported_function1 </option>
             <option> imported_function2 </option>
             <option> ... </option>
         <tool-call write_function_call "my_function"/>
         resp:
             <option> my_variable </option>
             <option> other_variable_of_same_type </option>
         <tool-call write_variable "my_variable"/>

    • Not sure I follow. You seem to have omitted the part of 1) explaining how the LLM knew that my_function even existed - presumably, it read the entire file to discover that, which is way more input tokens than your hypothetical available_functions response.

      7 replies →

  • LSP is meant for IDEs and very deterministic calls. Its APIs are like this: give me the definition at <file> <row> <column> <length>. This makes sense for IDEs because all of those can be deterministically captured based on your cursor position.

    LLMs are notoriously bad at counting.

    • I think one could easily build an MCP tool wrapping LSP which smooths over those difficulties. What the LLM needs is just a structured way to say "perform this code change" and a structured way to ask things like "what's the definition of this function?" or "what functions are defined in this module?"

      Not much different from what agents already do today inside of their harnesses, just without the part where they have to read entire files to find the definition of one thing.

      1 reply →

Humans also have limited context. For LLMs it's mostly a question of pipeline engineering to pack the context and system prompt with the most relevant information, and allow tool use to properly understand the rest of the codebase. If done well I think they shouldn't have this particular issue. Current AI coding tools are mostly huge amounts of this pipeline innovation.

  • I think we need an LLM equivalent of this corollary of Fitts's law: the fastest target to click is the one already under the cursor. For an LLM, the least context-expensive feedback is no feedback at all; the LLM should be able to intuit the correct code in place, at token-generation time.

I suspect that the context-dependence in C is more an issue with the implementation than the overall syntactic philosophy.

  • It's a thing in C.

    foo * bar;

    Is that multiplication? Or a declaration of type foo*?

    • Right; and my argument is that this isn't because the type expression `foo *` precedes the name `bar`; it's because the type "pointer to foo" is expressed in a way that could also be a prefix of a multiplication expression.

      2 replies →

[dead]

  • Fine tuning is likely a bigger part of it.

    I've worked on fine tuning projects. There's a massive bias towards fine tuning for Python at several model providers, for example, followed by JS.