Comment by accessvector

2 years ago

Out-of-bounds heap write happens in this function:

        int
        elf_read_pintable(struct proc *p, Elf_Phdr *pp, struct vnode *vp,
            Elf_Ehdr *eh, uint **pinp)
        {
         struct pinsyscalls {
          u_int offset;
          u_int sysno;
         } *syscalls = NULL;
         int i, npins = 0, nsyscalls;
         uint *pins = NULL;
        
    [1]  nsyscalls = pp->p_filesz / sizeof(*syscalls);
         if (pp->p_filesz != nsyscalls * sizeof(*syscalls))
          goto bad;
    [2]  syscalls = malloc(pp->p_filesz, M_PINSYSCALL, M_WAITOK);
    [3]  if (elf_read_from(p, vp, pp->p_offset, syscalls,
             pp->p_filesz) != 0) {
          goto bad;
         }
        
    [4]  for (i = 0; i < nsyscalls; i++)
    [5]   npins = MAX(npins, syscalls[i].sysno);
    [6]  npins = MAX(npins, SYS_kbind);  /* XXX see ld.so/loader.c */
    [7]  npins++;
        
    [8]  pins = mallocarray(npins, sizeof(int), M_PINSYSCALL, M_WAITOK|M_ZERO);
         for (i = 0; i < nsyscalls; i++) {
    [9]   if (pins[syscalls[i].sysno])
    [10]   pins[syscalls[i].sysno] = -1; /* duplicated */
          else
    [11]   pins[syscalls[i].sysno] = syscalls[i].offset;
         }
         pins[SYS_kbind] = -1;   /* XXX see ld.so/loader.c */
        
         *pinp = pins;
         pins = NULL;
        bad:
         free(syscalls, M_PINSYSCALL, nsyscalls * sizeof(*syscalls));
         free(pins, M_PINSYSCALL, npins * sizeof(uint));
         return npins;
        }

So first of all we calculate the number of syscalls in the pin section [1], allocate some memory for it [2] and read it in [3].

At [4], we want to figure out how big to make our pin array, so we loop over all of the syscall entries and record the largest we've seen so far [5]. (Note: the use of `MAX` here is fine since `sysno` is unsigned -- see near the top of the function).

With the maximum `sysno` found, we then crucially go on to clamp the value to `SYS_kbind` [6] and +1 at [7].

This clamped maximum value is used for the array allocation at [8].

We now loop through the syscall list again, but now take the unclamped `sysno` as the index into the array to read at [9] and write at [10] and [11]. This is essentially the vulnerability right here.

Through heap grooming, there's a good chance you could arrange for a useful structure to be placed within range of the write at [11] -- and `offset` is essentially an arbitrary value you can write. So it looks like it would be relatively easy to exploit.

8 comments

accessvector

accessvector 2 years ago

Re-reading this, my analysis is slightly incorrect: the `MAX` at [5] with an unsigned arg means we can make `npins` an arbitrary `int` using the loop at [4].

Choosing to make `npins` negative using that loop means we'll end up allocating an array of 87 (`SYS_kbind + 1`) `int`s at [8] and continue with the OOB accesses described.

You'd set up your `pinsyscall` entries like this:

    struct pinsyscall entries[] = {
        { .sysno = 0x1111, .offset = 0xdeadbeef }, /* first oob write */
        { .sysno = 0x2222, .offset = 0xf000f000 }, /* second oob write */
        { .sysno = 0xffffffff } /* sets npins to 0xffffffff so we under-allocate */
    };

`npins` would be `0xffffffff` after the loop and then the `MAX` at [6] would then return `86`, since `MAX(-1, 86) == 86`.

actionfromafar 2 years ago
I must misunderstand something very basic about this code.
if (pins[syscalls[i].sysno])
but pins is newly allocated and should just zero or "empty". Why dereference it right after allocation?
- accessvector 2 years ago
  
  Just to handle the case where the same syscall number is specified twice by the ELF header: in that case, the entry is set to -1 (presumably meaning it’s invalid).
  
  5 replies →