Comment by SheinhardtWigCo
1 day ago
Society is about to pay a steep price for the software industry's cavalier attitude toward memory safety and control flow integrity.
It's partly the industry and it's partly the failure of regulation. As Mario Wolczko, my old manager at Sun, says, nothing will change until there are real legal consequences for software vulnerabilities.
That said, I have been arguing for 20+ years that we should have sunsetted unsafe languages and moved away from C/C++. The problem is that every systemsy language that comes along gets seduced by having a big market share and eventually ends up an application language.
I do hope we make progress with Rust. I might disagree as a language designer and systems person about a number of things, but it's well past time that we stop listening to C++ diehards about how memory safety is coming any day now.
I think society is going to start paying the price for humans being human. As the paper points out there is a lot of good faith, serious software that has vulnerabilities. These aren't projects you would characterize as people being cavalier. It is simply beyond the limits of humans to create vulnerability-free software of high complexity. That's why high reliability software depends on extreme simplicity and strict tools.
100%, poorly architected software is really difficult to make secure. I think this will extend to AI as well. It will just dial up the complexity of the code until bugs and vulnerabilities start creeping in.
At some point, people will have to decide to stop the complexity creep and try to produce minimal software.
For any complex project with 100k+ lines of code, the probability that it has some vulnerabilities is very high. It doesn't fit into LLM context windows and there aren't enough attention heads to attend to every relevant part. On the other hand, for a codebase which is under 1000 lines, you can be much more confident that the LLM didn't miss anything.
Also, the approach of feeding the entire codebase to an LLM in parts isn't going to work reliably because vulnerabilities often involve interactions between different parts of the code. Both parts of the code may look fine if considered independently but together they create a vulnerability.
Good architecture is critical now because you really need to be able to have the entire relevant context inside the LLM context window... When considering the totality of all software, this can only be achieved through an architecture which adheres to high cohesion and loose coupling principles.
I'm not even talking about poorly architected software. They are finding vulnerabilities in incredibly well-engineered software. The Linux kernel is complex not because it's poorly written. It's complex because of all the things it needs to do. That makes it beyond the ability of a human to comprehend and reliably work with it.
> It doesn't fit into LLM context windows and there aren't enough attention heads to attend to every relevant part.
That's for one pass. And that pass can produce a summary of what the code does.
> These aren't projects you would characterize as people being cavalier.
I probably would. You mentioned the Linux kernel, which I think is a perfect example of software that has had a ridiculous, perhaps worst-in-class attitude towards security.
Thank god, finally someone said it.
I don't know the first thing about cybersecurity, but in my experience all these sandbox-break RCEs involve a step of hijacking the control flow.
There were attempts to prevent various flavors of this, but imo, as long as dynamic branches exist in some form, like dlsym(), function pointers, or vtables, we will not be rid of this class of exploit entirely.
The last one is the most concerning, as this kind of dynamic branching is the bread and butter of OOP languages. I'm not even sure you could write a nontrivial C++ program without it. Maybe Rust would be a help here? Could one practically write a large Rust program without any sort of branch to dynamic addresses? Static linking and compile-time polymorphism only?
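To make the question concrete, here is a minimal Rust sketch of the two styles. The `Shape` trait and the `Square`, `Circle`, and `AnyShape` types are made up for illustration. It only shows that Rust lets you choose between vtable dispatch (`dyn Trait`) and compile-time polymorphism (generics, or an enum over a closed set of types); it says nothing about whether that scales to a large program.

```rust
// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// Dynamic dispatch: each call to `area` goes through the trait object's
// vtable pointer, so the call target is only known at run time.
fn total_area_dyn(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Static dispatch: the generic is monomorphized per concrete type, so the
// call target is fixed at compile time.
fn area_static<S: Shape>(shape: &S) -> f64 {
    shape.area()
}

// Closed set of variants: a `match` on the enum tag stands in for the vtable.
enum AnyShape {
    Square(Square),
    Circle(Circle),
}

impl AnyShape {
    fn area(&self) -> f64 {
        match self {
            AnyShape::Square(s) => area_static(s),
            AnyShape::Circle(c) => area_static(c),
        }
    }
}

// Heterogeneous collection without any `dyn` in the source.
fn total_area_enum(shapes: &[AnyShape]) -> f64 {
    shapes.iter().map(AnyShape::area).sum()
}
```

Caveats: the compiler is free to lower a large `match` to a jump table, and return addresses are still dynamic branch targets, so this narrows the attack surface rather than eliminating indirect control flow.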
Everybody has been saying this for the last 15 years.
We're going to have to put all the bad code into a Wasm sandbox.