Comment by brundolf

2 days ago

I find this type of problem is what current AI is best at: the actual logic isn't very hard, but it requires pulling together and assimilating a huge amount of fuzzy, known information from various sources.

They are, after all, information-digesters

Which also fits with how it performs at software engineering (in my experience): great at boilerplate code, tests, simple tutorials, and common puzzles, but bad at novel and complex things.

  • This is also why I buy the apocalyptic headlines about AI replacing white-collar labor: most white-collar employment is creating the same things (a CRUD app, a landing page, a business plan) with a few custom changes

    Not a lot of labor is actually engaged in creating novel things.

    The marketing plan for your small business is going to be the same as the marketing plan for every other small business with some changes based on your current situation. There’s no “novel” element in 95% of cases.

    • I don't know that most software engineers build toy CRUD apps all day. I have found the state-of-the-art models to be almost completely useless in a real, large codebase. I tried the latest Claude and Gemini, since the company provides them, but they couldn't even write tests that pass after over a day of trying.

      12 replies →

    • I agree, but the reason it won't be an apocalypse is the same reason economists get most things wrong: it's not an efficient market.

      Relatively speaking, we live in a bubble: there are still broad swaths of the economy that operate with pen and paper, and another broad swath that migrated off 1980s-era AS/400s only in the last few years. Even if we had ASI available literally today (and we don't), I'd give it 20-30 years until the guy who operates your corner market or the local auto repair shop has any use in the world for it.

      2 replies →

    • I wonder what the impact will be when replicating the same thing becomes machine readable with near 100% accuracy.

  • Definitely matches my experience as well. I've been working away on a very quirky, non-idiomatic 3D codebase, and LLMs are a mixed bag there. Y is down, there's no perspective distortion or Z buffer, there are no meshes; it's a weird place.

    It's still useful for saving me from writing 12 variations of x1 = sin(r2) - cos(r1) while implementing some geometric formula, but it's absolutely awful at understanding how those fit into a deeply atypical environment. I also have to put blinders on it: giving it too much context just throws it back into that typical 3D rut and has it trying to slip in perspective distortion again.
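
    Something like this toy sketch (invented names, nothing from the real codebase) captures the conventions that trip it up:

    ```typescript
    // Toy sketch of the odd conventions: world Y points down, matching
    // screen space, and "projection" just drops Z. No perspective divide,
    // no depth buffer, no meshes.
    type Vec3 = { x: number; y: number; z: number };
    type Vec2 = { x: number; y: number };

    function project(p: Vec3): Vec2 {
      // No divide-by-z and no Y flip: exactly the two "fixes" an LLM
      // keeps trying to slip back in.
      return { x: p.x, y: p.y };
    }

    // The kind of repetitive trig variation it IS genuinely good at churning out.
    function offsetX(r1: number, r2: number): number {
      return Math.sin(r2) - Math.cos(r1);
    }
    ```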

    • Yeah, I have the same experience. I've done some work on novel realtime text collaboration algorithms. For optimisation, I use some somewhat bespoke data structures (e.g. an order-statistic tree storing substring lengths, with internal run-length encoding in the leaf nodes).

      ChatGPT is pretty useless with this kind of code. I got it to help translate a run-length-encoded B-tree from Rust to TypeScript. Even with a reference, it still introduced a bunch of new bugs, some of them very subtle.
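
      For a flavour of the structure, here's a stripped-down sketch (invented names, far simpler than the real thing):

      ```typescript
      // Leaves hold run-length-encoded entries; each internal node caches
      // the total length beneath it, so finding the leaf containing a
      // global character offset is one O(log n) descent.
      type Run = { length: number; inserted: boolean };
      type Leaf = { kind: "leaf"; runs: Run[] };
      type Internal = { kind: "internal"; children: TreeNode[]; totalLength: number };
      type TreeNode = Leaf | Internal;

      function lengthOf(node: TreeNode): number {
        return node.kind === "leaf"
          ? node.runs.reduce((sum, r) => sum + r.length, 0)
          : node.totalLength;
      }

      // Order-statistic descent: locate the leaf containing `offset`.
      function findLeaf(node: TreeNode, offset: number): { leaf: Leaf; rest: number } {
        if (node.kind === "leaf") return { leaf: node, rest: offset };
        for (const child of node.children) {
          const len = lengthOf(child);
          if (offset < len) return findLeaf(child, offset);
          offset -= len;
        }
        throw new Error("offset out of range");
      }
      ```

      The subtle bugs were exactly in this kind of bookkeeping: splitting runs and keeping the cached lengths in sync after edits.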

      1 reply →

  • Yep. But it's wonderful at aggregating details from twelve different man pages to write a shell script I didn't even know was possible to write using the system utils.

    • Is it 'only' "aggregating details from twelve different man pages", or has it 'studied' (scraped) all (accessible) code on GitHub/GitLab/StackExchange/etc. and any other publicly available coding repositories on the web (and, in MS's case, the GitHub it owns)? Together with descriptions of what is right and what is wrong.

      I use it for code, and I only do fine-tuning. When I want something that has clearly never been done before, I 'talk' to it and train it on which method to use; to a human brain some suggestions/instructions are clearly obvious (use an Integer and not a Double, or use Color, not Weight). So I do 'teach' it as well when I use it.

      Now, I imagine that when 1 million people use LLMs to write code and fine-tune the code, we are inherently training the LLMs to write even better code.

      So it's not just "..different man pages.." but "the finest coding brains (excluding mine)" tweaking and training it.

How often are we truly writing actual novel programs that are complex in a way AI does not excel at?

    There are many types of complexity, and many things that are complex for a human coder are trivial for an AI with its skill set.

    • Depends on the field of development you do.

      CRUD backend app for a business in a common sector? It's mostly just connecting stuff together (though I would argue that an experienced dev with a good stack takes less time writing it directly than painstakingly explaining it to an LLM in an inexact human language).
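
      For illustration, a hypothetical endpoint (Express-style, all names invented); the whole job is wiring the request to the data layer and back:

      ```typescript
      import express from "express";
      import { randomUUID } from "node:crypto";

      // Hypothetical CRUD endpoints (invented names): plumbing between
      // HTTP and a data store, much the same in every app of this kind.
      const app = express();
      app.use(express.json());

      const db = new Map<string, { id: string; name: string }>(); // stand-in data layer

      app.post("/customers", (req, res) => {
        const customer = { id: randomUUID(), name: req.body.name };
        db.set(customer.id, customer);
        res.status(201).json(customer);
      });

      app.get("/customers/:id", (req, res) => {
        const customer = db.get(req.params.id);
        if (!customer) return res.status(404).json({ error: "not found" });
        res.json(customer);
      });

      app.listen(3000);
      ```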

      Some R&D stuff, or even debugging any kind of code? It's almost useless, as it would require deep reasoning, where these models absolutely break down.

      4 replies →

  • > novel and complex things

    a) What's an example?

    b) Is 90% (or more) of programming mundane, and not really novel?

    • If you'd like a creative waste of time, make it implement any novel algorithm that mixes the idea of X with Y. It will fail miserably, double down on the failure and hard troll you, run out of context and leave you questioning why you even pay for this thing. And it is not something that can be fixed with more specific training.

      8 replies →

I've been surprised that so much focus has been put on generative uses for LLMs and similar ML tools. It seems to me they have a much better chance of being useful when tasked with interpreting given information rather than generating something meant to appear new.

  • Yeah, the "generative" in "generative AI" gives a little bit of a false impression. I like Laurie Voss's take on this: https://seldo.com/posts/what-ive-learned-about-writing-ai-ap...

    > Is what you're doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it's probably going to be great at it. If you're asking it to convert into a roughly equal amount of text it will be so-so. If you're asking it to create more text than you gave it, forget about it.

    • This quote sounds clever, but it's very different from my experience.

      I have been very pleased with responses to things like: "explain x", "summarize y", "make up a parody song about A to the tune of B", "create a single page app that does abc".

      The response is 1000x more text than the prompt.

      1 reply →

    • I've had coworkers tell me Copilot works well for refactoring code, which also makes sense in the same vein.

      It's like they wouldn't be so controversial if they hadn't decided to market it as "generative" or "AI"... I assume fundraising valuations would move in line with the level of controversy, though.

FWIW, I do a lot of talks about AI in the physical security domain and this is how I often describe AI, at least in terms of what is available today. Compared to humans, AI is not very smart, but it is tireless and able to recall data with essentially perfect accuracy.

It is easy to mistake the speed, accuracy, and scope of training data for "intelligence", but it's really more like a tireless 5th grader.

  • Something I have found quite amusing about LLMs is that they are computers that don't have perfect recall - unlike every other computer for the past 60+ years.

    That is finally starting to change now that they have reliable(ish) search tools and are getting better at using them.

“best where the actual logic isn’t very hard”?

Yeah, well, it's also one of the top scorers on the Math Olympiads.

  • My guess is that those questions are very typical and follow very normal patterns and use well established processes. Give it something weird and it'll continuously trip over itself.

    My current project is nothing too bizarre; it's a 3D renderer, well-trodden ground. But my project breaks a lot of core assumptions and common conventions, and so every LLM I try to introduce (Gemini 2.5 Pro, Claude 3.7 Thinking, o3) tangles itself up between what's actually in the codebase and the strong pull of what's in the training data.

    I tried layering on reminders and guidance in the prompting, but ultimately I just end up narrowing its view, limiting its insight, and removing even the context that this is a 3D renderer and not just pure geometry.

    • > Give it something weird and it'll continuously trip over itself.

      And so will almost all humans. It's weird how people refuse to ascribe any human-level intelligence to it until it starts to compete with the world's top elite.

      5 replies →

  • LLMs struggle with limited context windows, so as long as the problem can be solved within their small windows, they do great.

    Human neural networks are constantly being retrained, so their effective context window is huge. The LLM may be better at a complex, well-specified 200-line Python program, but the human brain is better at the 1M-line real-world application. It takes some study, though.

LLMs are like a knowledge aggregator. The reasoning models have the potential to be usefully creative, but I have yet to see evidence of it, such as inventing something scientifically novel.

Be that as it may, do not forget that in the pursuit of the most textually plausible output, gaps may be filled in for you.

The mistake, and it's a common one, is in using phrases like "the actual logic" to explain to ourselves what is happening.

It takes a lot of energy to compress the data, and a lot to actually extract something sensible, when you could often just optimize the single problem you have quite easily.