Comment by Joker_vD
10 months ago
Of course, if you cede RBP to be a frame pointer, you may as well have two stacks, one which is pointed into by RBP and stores the activation frames, and the other one which is pointed into by RSP and stores the return addresses only. At this point, you don't even need to "walk the stack" because the call stack is literally just a flat array of return addresses.
Why do we normally store the return addresses near to the local variables in the first place, again? There are so many downsides.
It simplifies storage management. A stack frame is a simple bump-pointer allocation which is always in cache and needs only one guard page for overflow. Your proposal needs two guard pages, doubles the stack manipulations, and doubles the chance of a cache miss.
Yes, two guard pages are needed. No, the stack management stays the same: it's just "CALL func" at the call site, "SUB RBP, <frame_size>" at the prologue and "ADD RBP, <frame_size>; RET" at the epilogue. As for chances of a cache miss... probably, but I guess you also double them up when you enable CET/Shadow Stack so eh.
In exchange, it becomes very difficult for the stack smashing to corrupt the return address.
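For illustration, here is a toy Python model of the scheme described above. It is only a sketch: `TwoStackMachine`, its method names and the 64-slot frame memory are made up for this example, and nothing here reflects how real hardware lays out memory. It just shows that CALL, the prologue and the epilogue each touch one pointer, and that a backtrace is a plain read of the return stack.

```python
class TwoStackMachine:
    """Toy model: return addresses and activation frames live on separate stacks."""

    def __init__(self, frame_memory_size=64):
        # Stands in for the RSP stack: holds return addresses only.
        self.return_stack = []
        # Stands in for the RBP stack: holds local variables only.
        self.frame_memory = [0] * frame_memory_size
        self.rbp = frame_memory_size  # the frame stack grows downward

    def call(self, return_address, frame_size):
        if self.rbp - frame_size < 0:
            raise MemoryError("frame stack overflow (would hit the guard page)")
        self.return_stack.append(return_address)  # CALL func
        self.rbp -= frame_size                    # prologue: SUB RBP, <frame_size>

    def ret(self, frame_size):
        self.rbp += frame_size                    # epilogue: ADD RBP, <frame_size>
        return self.return_stack.pop()            # RET

    def backtrace(self):
        # No frame walking: the call stack is a flat array, innermost call first.
        return list(reversed(self.return_stack))


machine = TwoStackMachine()
machine.call(return_address=0x1000, frame_size=16)
machine.call(return_address=0x2000, frame_size=8)
print([hex(addr) for addr in machine.backtrace()])
```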
The Reduceron had five stacks and it was faster because of it.
Note the ‘shadow stacks’ CPU feature mentioned briefly in the article, though it’s more for security reasons. It’s pretty similar to what you describe.
Shadow stacks have been proposed as an alternative, although it's my understanding that in current CPUs they hold only a limited number of frames, like 16 or 32?
You may be thinking of the return stack buffer. The shadow stack holds every return address.
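To make the distinction concrete, here is a rough Python sketch. The function name and the 16-entry default are assumptions for illustration (16 is just the figure guessed above), and real return stack buffers differ by microarchitecture in how they behave when they run dry. The point is only that a small fixed-size predictor loses the oldest entries on deep call chains, which costs mispredictions, while the shadow stack keeps every return address.

```python
from collections import deque


def count_rsb_mispredictions(call_depth, rsb_entries=16):
    """Nest `call_depth` calls, unwind them, and count wrong RSB predictions."""
    rsb = deque(maxlen=rsb_entries)  # fixed size: oldest entry is dropped when full
    shadow_stack = []                # unbounded: every return address is kept

    for return_address in range(call_depth):
        rsb.append(return_address)
        shadow_stack.append(return_address)

    mispredictions = 0
    for _ in range(call_depth):
        actual = shadow_stack.pop()
        # Assumption: an empty RSB simply offers no prediction.
        predicted = rsb.pop() if rsb else None
        if predicted != actual:
            mispredictions += 1
    return mispredictions


print(count_rsb_mispredictions(call_depth=10))  # shallow chain fits in the RSB
print(count_rsb_mispredictions(call_depth=40))  # deep chain overflows the RSB
```

In this model a misprediction is only a performance cost: the shadow stack still holds the correct address for every return.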
While we're here, why do we grow the stack the wrong way, so that misbehaved programs cause security issues? I know the reason, of course: like so many things, it last made sense 30 years ago. But the effects have been interesting.
You may be ready for Forth [1] ;-). Strangely, the Wikipedia article apparently doesn't mention that Forth allows access to both the parameter stack and the return stack, which is a major feature of the model.
[1] https://en.wikipedia.org/wiki/Forth_(programming_language)
That does seem like a significant oversight. >r and r>, and cousins, are part of ANSI Forth, and I've never used a Forth which doesn't have them.
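For anyone who hasn't seen them, here is a toy Python sketch of what `>r`, `r>` and `r@` do. `run_forth` is a made-up helper, not a real Forth: it handles only integers and a handful of words, and it does not model colon definitions, so the return stack here never holds actual return addresses the way it does in a real system.

```python
def run_forth(source):
    """Evaluate a tiny Forth-like program; return (data_stack, return_stack)."""
    data, ret = [], []

    def swap():
        data[-1], data[-2] = data[-2], data[-1]

    words = {
        ">r": lambda: ret.append(data.pop()),   # move top of data stack to return stack
        "r>": lambda: data.append(ret.pop()),   # move top of return stack back
        "r@": lambda: data.append(ret[-1]),     # copy top of return stack
        "+": lambda: data.append(data.pop() + data.pop()),
        "dup": lambda: data.append(data[-1]),
        "swap": swap,
    }

    for token in source.split():
        if token in words:
            words[token]()
        else:
            data.append(int(token))  # anything else is assumed to be a number
    return data, ret


# Park 2 on the return stack, add 10 to the 1 underneath, then bring 2 back.
print(run_forth("1 2 >r 10 + r>"))
```

In a real Forth the `>r` and `r>` must balance within a definition, precisely because the same stack holds the return addresses.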
Forth has a parameter stack, a return stack, and a vocabulary stack.
STOIC, a variant of Forth, includes a file stack when loading words.
I'm not sure what you're referring to with "vocabulary stack" here, perhaps the dictionary? More of a linked list, really a distinctive data structure of its own.
> Why do we normally store the return addresses near to the local variables in the first place, again? There are so many downsides.
The advantage of storing them elsewhere is not quite clear (unless you have hardware support for things like shadow stacks).
You'd have to argue that moving return addresses to another page and managing two pointers (one of which is less powerful in the ISA) is meaningfully cheaper than stack cookies/protectors, an equally effective mitigation that can already be applied only where needed. There is no real security benefit to doing this over what we currently have with stack protectors, since an arbitrary read/write will still lead to a CFI bypass.
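A rough Python sketch of that argument, under loud assumptions: `run_protected_frame`, the list-based frame layout and the addresses are all hypothetical, and a real stack protector is emitted by the compiler rather than written by hand. It shows a cookie catching a linear overflow, and an arbitrary write skipping past the cookie to the return address.

```python
import secrets


def run_protected_frame(payload=(), arbitrary_write=None,
                        buffer_size=4, return_address=0x401000):
    """Model one frame laid out as [buffer..., cookie, return address]."""
    cookie = secrets.randbits(64)  # random per call, unknown to the attacker
    frame = [0] * buffer_size + [cookie, return_address]

    # Linear overflow: an unchecked copy that runs past the end of the buffer.
    for index, value in zip(range(len(frame)), payload):
        frame[index] = value

    # Arbitrary write: the attacker picks the slot, so the cookie is untouched.
    if arbitrary_write is not None:
        index, value = arbitrary_write
        frame[index] = value

    # Epilogue check, standing in for what a stack protector does before RET.
    if frame[buffer_size] != cookie:
        raise RuntimeError("stack smashing detected")
    return frame[buffer_size + 1]


print(hex(run_protected_frame(payload=[1, 2])))                # benign call
print(hex(run_protected_frame(arbitrary_write=(5, 0xBADBAD)))) # cookie bypassed
try:
    run_protected_frame(payload=[0x41] * 6)                    # linear overflow
except RuntimeError as error:
    print(error)
```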
> The advantage of storing them elsewhere is not quite clear (unless you have hardware support for things like shadow stacks).
The classic buffer overflow issue should spring immediately to mind. With a separate return address stack, return addresses are far less vulnerable to corruption from overflowing your data structures. This stops a bunch of attacks which purposely put crafted return addresses into position to jump the program to malicious code.
It's not a panacea, but generally keeping code pointers away from data structures is a good idea.
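As a rough illustration in Python, with made-up function names, list-based "memory" and arbitrary addresses: the same unchecked copy overwrites the return address when it sits right after the buffer, and only clobbers neighbouring data when return addresses live on their own stack.

```python
def overflow_unified(payload, buffer_size=4, return_address=0x401000):
    """One stack: the return address sits directly after the buffer."""
    stack = [0] * buffer_size + [return_address]
    for index, value in zip(range(len(stack)), payload):  # unchecked copy
        stack[index] = value
    return stack[buffer_size]  # the address RET would jump to


def overflow_split(payload, buffer_size=4, return_address=0x401000):
    """Two stacks: the buffer's neighbour is just another data slot."""
    data_stack = [0] * buffer_size + [0]
    return_stack = [return_address]
    for index, value in zip(range(len(data_stack)), payload):  # same unchecked copy
        data_stack[index] = value
    return return_stack[-1]  # untouched by the overflow


attack = [0x41] * 4 + [0xDEADBEEF]
print(hex(overflow_unified(attack)))  # return address replaced by the payload
print(hex(overflow_split(attack)))    # return address unchanged
```

This only models a contiguous overflow. An attacker who can write to an arbitrary address can still reach the separate stack, which is why it's not a panacea.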