Comment by jiggawatts

3 days ago

I've been on this ride about three or four times over decades. Every new major wave of technology takes a surprisingly long time to be adopted, despite advantages that seem obvious to the evangelists.

I had the exact same experience with, for example, rolling out fully virtualized infrastructure (VMware ESXi) when that was a new concept.

The resistance was just incredible!

"That's not secure!" was the most common push-back, despite all the evidence showing that VM-level isolation combined with VLANs provided much better isolation than huge consolidated servers running dozens of apps.

"It's slower!" was another common complaint, pointing at the ~20% overhead that was the norm at the time (before CPU hardware assists such as nested page tables). Sure, sure, in benchmarks. But in practice, putting a small VM on a big host meant it inherited the fast network and fibre-channel adapters, and hence could burst far above the performance you'd get from a low-end "pizza box" with a pair of mechanical drives in RAID 10.

I see the same kind of naive, uninformed push-back against AI. And that's from people who are at least aware of it. I regularly talk to developers who have never even heard of tools like Codex, Gemini CLI, or whatever! This just hasn't percolated through the wider industry to the level that it has in Silicon Valley.

Speaking of security, the scenarios are oddly similar. Sure, prompt injection is a thing, but modern LLMs are vastly "more secure" in a certain sense than traditional solutions.

Consider Data Loss Prevention (DLP) policy engines. Most use nothing more than simple regular-expression patterns looking for things like credit card numbers, social security numbers, and so on. Similarly, there are policy engines that look for swearwords, internal project code names being sent to third parties, etc.

All of those are trivially bypassed even by accident! Simply screenshot a spreadsheet and attach the PNG. Swear at the customer in a language other than English. Put spaces in between the characters in each s w e a r word. Whatever.
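To make that fragility concrete, here's a toy regex-only card-number check (an illustrative pattern, not one taken from any real DLP product) being defeated by nothing more than separators:

```python
import re

# A typical DLP-style pattern: a run of 13-16 consecutive digits.
# Purely illustrative; real products use fancier patterns, but the
# failure mode below applies to them too.
CARD_PATTERN = re.compile(r"\b\d{13,16}\b")

def regex_dlp_flags(text: str) -> bool:
    """Return True if the text trips the naive card-number rule."""
    return bool(CARD_PATTERN.search(text))

print(regex_dlp_flags("card: 4111111111111111"))     # True  - flagged
print(regex_dlp_flags("card: 4111 1111 1111 1111"))  # False - spaces defeat it
print(regex_dlp_flags("card: 4111-1111-1111-1111"))  # False - hyphens defeat it
```

A screenshot of the same number as a PNG, of course, never even reaches the pattern matcher.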

None of those tricks work against a modern AI. Even if you very carefully phrase a hurtful statement while avoiding the banned word list, the AI will know it's hurtful and flag it. Even if you use an obscure language. Even if you embed it in a meme picture. It doesn't matter; it'll flag it!

This is a true step change in capability.

It'll take a while for people to be dragged into the future, kicking and screaming the whole way there.

Would you trust an LLM to recognize a credit card number more reliably than a regular expression can?

  • You're not forced to use only an LLM for data loss prevention! You can combine it with regex. You can also feed the output of the regex matches to the LLM as extra "context".
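A minimal sketch of that combination, under the assumption of a hypothetical pipeline where a regex proposes candidates, a Luhn checksum prunes obvious false positives, and only the survivors (plus surrounding text) get forwarded to the LLM as extra context (the LLM call itself is left as a comment):

```python
import re

# Hypothetical hybrid DLP pre-filter: regex finds *candidate* card
# numbers, allowing spaces/hyphens between digits so simple
# obfuscation doesn't slip through.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        n = int(ch)
        if i % 2 == 1:
            n *= 2
            if n > 9:
                n -= 9
        total += n
    return total % 10 == 0

def card_candidates(text: str) -> list[str]:
    """Return normalized digit runs that pass the Luhn check."""
    found = []
    for m in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            found.append(digits)
    return found

# The surviving matches could then be embedded in an LLM prompt, e.g.:
# prompt = f"Candidate card numbers {card_candidates(msg)} were found in: {msg!r}. Is this a real leak?"
```

The regex stays cheap and deterministic, the Luhn check cuts the false-positive noise, and the LLM only has to judge the ambiguous cases with full context.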

    Similarly, I was just flipping through the SQL Server 2025 docs on vector indexes. One of their demos was a "hybrid" search that combined exact text match with semantic vector embedding proximity match.
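The shape of that hybrid scoring can be sketched in a few lines. This is not SQL Server's implementation; the hand-made vectors below stand in for real embeddings, and a keyword hit is simply blended with cosine similarity using a weight alpha:

```python
import math

# Toy corpus: each document has text plus a hand-made "embedding"
# standing in for a real embedding model's output.
DOCS = {
    "doc1": ("reset your password", [0.9, 0.1, 0.0]),
    "doc2": ("change account credentials", [0.8, 0.3, 0.1]),
    "doc3": ("quarterly sales report", [0.0, 0.2, 0.9]),
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hybrid_search(query_text, query_vec, alpha=0.5):
    """Blend an exact-substring hit with semantic proximity.

    alpha weights the exact-match signal; (1 - alpha) weights
    the embedding similarity. Returns (score, doc_id) descending.
    """
    results = []
    for doc_id, (text, vec) in DOCS.items():
        exact = 1.0 if query_text.lower() in text.lower() else 0.0
        semantic = cosine(query_vec, vec)
        results.append((alpha * exact + (1 - alpha) * semantic, doc_id))
    return sorted(results, reverse=True)

# "password" is both an exact hit in doc1 and semantically close to it,
# so doc1 wins; doc2 is semantically close but has no exact match.
print(hybrid_search("password", [0.9, 0.2, 0.0])[0][1])  # doc1
```

The appeal of the hybrid is exactly the DLP point above: the exact-match leg keeps precision on literal terms, while the vector leg catches paraphrases the keyword index would miss.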