Comment by throw1235435

1 month ago

I also believe this - I think it will probably just disrupt software engineering and any other digital medium with mass internet publication (i.e. things RLVR can use). For the short term future it seems to need a lot of data to train on, and no other profession has posted the same amount of verifiable material. The open source altruism has disrupted the profession in the end; just not in the way people first predicted. I don't think it will disrupt most knowledge work for a number of reasons. Most knowledge professions have "credentials' (i.e. gatekeeping) and they can see what is happening to SWE's and are acting accordingly. I'm hearing it firsthand at least locally in things like law, even accounting, etc. Society will ironically respect these professions more for doing so.

Any data, verifiability, rules of thumb, tests, etc are being kept secret. You pay for the result, but don't know the means.

3 comments

throw1235435

coffeebeqn 1 month ago

I mean law and accounting usually have a “right” answer that you can verify against. I can see a test data set being built for most professions. I’m sure open source helps with programming data but I doubt that’s even the majority of their training. If you have a company like Google you could collect data on decades of software work in all its dimensions from your workforce

District5524 1 month ago

It's not about invalidating your conclusion, but I'm not so sure about law having a right answer. At a very basic level, like hypothetical conduct used in basic legal training matrerials or MCQs, or in criminal/civil code based situations in well-abstracting Roman law-based jurisdictions, definitely. But the actual work, at least for most lawyers is to build on many layers of such abstractions to support your/client's viwepoint. And that level is already about persuasion of other people, not having the "right" legal argument or applying the most correct case found. And this part is not documented well, approaches changes a lot, even if law remains the same. Think of family law or law of succession - does not change much over centuries but every day, worldwide, millions of people spend huge amounts of money and energy on finding novel ways to turn those same paragraphs to their advantage and put their "loved" ones and relatives in a worse position.
throw1235435 1 month ago

Not really. I used to think more general with the first generation of LLM's but given all progress since o1 is RL based I'm thinking most disruption will happen in open productive domains and not closed domains. Speaking to people in these professions they don't think SWE's have any self respect and so in your example of law:
* Context is debatable/result isn't always clear: The way to interpret that/argue your case is different (i.e. you are paying for a service, not a product)
* Access to vast training data: Its very unlikely that they will train you and give you data to their practice especially as they are already in a union like structure/accreditation. Its like paying for a binary (a non-decompilable one) without source code (the result) rather than the source and the validation the practitioner used to get there.
* Variability of real world actors: There will be novel interpretations that invalidate the previous one as new context comes along.
* Velocity vs ability to make judgement: As a lawyer I prefer to be paid higher for less velocity since it means less judgement/less liability/less risk overall for myself and the industry. Why would I change that even at an individual level? Less problem of the commons here.
* Tolerance to failure is low: You can't iterate, get feedback and try again until "the tests pass" in a court room unlike "code on a text file". You need to have the right argument the first time. AI/ML generally only works where the end cost of failure is low (i.e can try again and again to iron out error terms/hallucinations). Its also why I'm skeptical AI will do much in the real economy even with robots soon - failure has bigger consequences in the real world ($$$, lives, etc).
* Self employment: There is no tension between say Google shareholders and its employees as per your example - especially for professions where you must trade in your own name. Why would I disrupt myself? The cost I charge is my profit.
TL;DR: Gatekeeping, changing context, and arms race behavior between participants/clients. Unfortunately I do think software, art, videos, translation, etc are unique in that there's numerous examples online and has the property "if I don't like it just re-roll" -> to me RLVR isn't that efficient - it needs volumes of data to build its view. Software sadly for us SWE's is the perfect domain for this; and we as practitioners of it made it that way through things like open source, TDD, etc and giving it away free on public platforms in numerous quantities.