← Back to context

Comment by dullcrisp

18 hours ago

Why/when are traps used rather than explicit system calls? Is it just historical coevolution? Or is the idea that the user mode program doesn’t need to know that it’s unprivileged? Or is it just repurposing the error handler path to perform privileged operations?

As others have said, a trap is one way to implement system calls. It's literally the TRAP mnemonic on a 68000.

https://www.nxp.com/docs/en/reference-manual/M68000PRM.pdf see page 292, also see page 629 for the table of "exception vectors" (addresses for code to handle each specific trap/exception/interrupt)

Most processors support both "interrupts" (an external peripheral is banging on the CPU's interrupt pins... but also invocable from software; software interrupts; SWIs; INT instruction on x86) and "exceptions" (e.g. divide by zero, bus error, illegal instruction). Depending on the processor, accessing the "privileged" mode can be done either by software interrupts, exceptions, or both. An operating system should pick one and stick with it.

Other uses for interrupt/exception/trap vectors include hardware breakpoints: don't try and single-step the CPU, overwrite the code with an illegal instruction and control will flow to the illegal instruction handler where you can see all the registers then execute the real instruction that was meant to be there and return to where you left off. Some CPUs have a formal "BKPT" type instruction for that.

One other use on the 68000 is that any unrecognised instruction that started $Fxxx triggered the F-line handler; all the floating point instructions were in the form $Fxxx, so if you didn't have an FPU, you could put a software emulator for the FPU instructions in the F-line handler and software wouldn't know the difference. Traps/exceptions don't have to be a jump from unprivileged to privileged, they can just be utilitarian.

  • > Depending on the processor, accessing the "privileged" mode can be done either by software interrupts, exceptions, or both. An operating system should pick one and stick with it.

    Practically most will support access via both, but for different reasons. For example, page faults (which the software cannot possibly predict) are going to be exception-mediated, but syscalls (which the software asks for) are triggered via an interrupt.

At the time of DOS, x86 didn't have multiple privileges. The system call instruction was typically INT, the software interrupt instruction.

Later on the 386 Intel added virtual 8086 mode which trapped to the kernel privileged instruction exception also for certain instructions that had to be virtualized, among them INT.

  • Yes, exactly.

    We used a set of INT instructions in well-known low memory addresses that all jumped to the same place. We had an ASM file that you linked with, that had sixteen different address combinations for each.

    The common entry point would look back on the stack and calculate from the return address which entry point had been called, and run the appropriate kernel call. We called it the CS:IP hack.

    In the context of this post, the DOS INT10 and INTx(I forget) required the caller to load registers with the desired system call number, then perform the trap instruction in their code. Fortunately CTOS didn't need those particular software interrupts, so I could implement them for my purposes.

    • Windows 95 used a related hack. Whenever a v8086 program asked to create a call to protected mode code ("please give me a real mode address to call to, in order to start executing the protected mode routine at address 0x123456"), Windows would store the entry point in a table and hand out real mode addresses like FFD0:0, FFCF:10, FFCE:20, FFCD:30, FFCC:40 that all point to the same instruction (because the segment part is shifted left by 4 in real or v8086 modes).

      The routine at 0xFFD00 could then enter protected mode and use the code segment to build the index into a table of entry points: FFD0 goes to index 0, FFCF goes to index 1, and so on. But for extra kicks, the address isn't actually pointing to valid code. It points to a random "c" character in the BIOS, which is an ARPL instruction - which in turn is invalid in v8086 mode and therefore invokes the undefined opcode exception handler. The exception handler, which handily enough is already running in protected mode, then takes care of doing the 32-bit call.

      Related: https://news.ycombinator.com/item?id=45283085

Depends on the processor architecture and its nomenclature.

Traps typically also result from exceptional conditions (like divide by zero or page fault).

An architecture may or may not provide non-trap paths for less-privileged code to invoke more-privileged subsystems (call gates, "syscall" instructions, etc.).

Traps typically need some way to preserve all userspace-accessible registers (otherwise resuming from a page fault is .. hard). Dedicated syscall instructions may only need to restore a subset of registers.

In some implementations, processors may discover that an instruction must trap after it starts irreversibly changing architecturally-visibile state; in cases like that, the processor needs to leave enough breadcrumbs for the OS to allow either a clean unwind or a resumption of the interrupted instruction. My understanding is that the original 68000 somewhat famously got this wrong.

Traps were/are the mechanism for doing syscalls.

I don't know OG x86 (cuz, ewww) but on 68k this was generally the way. On my Atari ST a syscall was performed by filling your registers and stack as expected, then executing one of the TRAP opcodes and that would get the CPU To save PC etc & jump to the handler but in supervisor mode, where your syscall could then read state perform accordingly, and then return back to you.

I think x86_64 has just formalized this into a specific SYSCALL instruction?

ARM variants call it SVC (supervisor call).

Same difference.

Some older operating systems just implemented their syscalls as ordinary subroutine jumps, though, and everything ran in supervisor etc. I believe AmigaOS was like this, you just went through a jump table. Which, I think, shaves some cycles but also means compromises in terms of building for memory protection, etc.