Implementing a tiny CPU rasterizer (2024)

5 days ago (lisyarus.github.io)

This is a great resource. Some others along the same lines:

TinyRenderer: https://news.ycombinator.com/item?id=46410210#46416135

I tried doing this in Python a while ago. It did not go well, and it really showed how SLOW Python is.

Even with just a 1280x720 window, setting every pixel to a single color by writing a value into a byte array and then handing PyGame the full frame to draw, I maxed out at around 10 fps. I tried many things and simply could not get it any faster.
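To illustrate where that time tends to go, here is a minimal sketch, not the code from the attempt above. It leaves PyGame out entirely and only compares two ways of filling a frame-sized bytearray. The names `fill_per_pixel`, `fill_bulk` and `time_fill` are made up for this example, and a 3-bytes-per-pixel RGB layout is assumed.

```python
import time

WIDTH, HEIGHT = 1280, 720
BYTES_PER_PIXEL = 3  # assumed RGB layout, one byte per channel


def fill_per_pixel(buf, r, g, b):
    """Write every pixel from Python code: one interpreted loop step per pixel."""
    for i in range(0, len(buf), BYTES_PER_PIXEL):
        buf[i] = r
        buf[i + 1] = g
        buf[i + 2] = b


def fill_bulk(buf, r, g, b):
    """Build the whole frame in one expression and copy it in with a slice assignment."""
    buf[:] = bytes((r, g, b)) * (len(buf) // BYTES_PER_PIXEL)


def time_fill(fill, buf):
    """Return the wall-clock seconds one fill of the buffer takes."""
    start = time.perf_counter()
    fill(buf, 255, 128, 0)
    return time.perf_counter() - start


if __name__ == "__main__":
    frame = bytearray(WIDTH * HEIGHT * BYTES_PER_PIXEL)
    print("per-pixel loop:", time_fill(fill_per_pixel, frame), "s")
    print("bulk fill:     ", time_fill(fill_bulk, frame), "s")
```

Both functions leave the buffer in the same state. The per-pixel version runs 921,600 loop iterations in the interpreter for one 1280x720 frame, while the bulk version does the copy inside the bytes/bytearray implementation. Actual timings depend on the machine and Python version, so none are quoted here.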

> Triangles are easy to rasterize

Sure, rasterizing a triangle is not so hard, but, you know, rasterizing a rectangle is far, far easier.

  • Rasterizing triangles is a nightmare, especially if performance is a goal. One of the biggest issues is getting abutting triangles to render so you don't have overlapping pixels or gaps.

    I did this stuff for a living 30 years ago. Just this week I had Deep Think create a 3D engine with a triangle rasterizer in 16-bit x86 for the original IBM XT.

    • It's fairly easy to get triangle rasterisation performant if you think about the problem hard enough.

      Here's an implementation I wrote for the PS3 SPU many moons ago: https://github.com/ralferoo/spugl/blob/master/pixelshaders/t...

      That does perspective-correct texture mapping and, from a quick count of the instructions in the main loop, takes approximately 44 cycles per 8 pixels.

      Solving the half-line equation this way also doesn't produce overlapping pixels or gaps between abutting triangles, as long as the two endpoints of the shared edge are identical in both triangles and you use fixed-point arithmetic.

      The key trick is to rework each line equation such that it's effectively x.dx+y.dy+C=0. You can then evaluate A=x.dx+y.dy+C at the top left of the square that encloses the triangle. For every pixel to the right, you can just add dx, and for every pixel down, you can just add dy. The sign bit indicates whether the pixel is or isn't inside that side of the triangle, and you can and/or the 3 sides' sign bits together to determine whether a pixel is inside or outside the triangle. (Whether to use and or or depends on how you've decided to interpret the sign bit.)
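      As a rough illustration of that stepping scheme, here is a minimal sketch in Python with plain integers. It is not the SPU code linked above. The names `edge_setup` and `rasterize` are made up for this example, samples are taken at integer pixel coordinates, and the tie-break for pixels lying exactly on an edge is an assumption added here so that two triangles sharing an edge never both claim the same pixel.

```python
def edge_setup(ax, ay, bx, by):
    """Return (dx, dy, c) with E(x, y) = x*dx + y*dy + c for the edge a -> b.

    E is zero on the line through a and b and changes sign across it.
    """
    dx = by - ay
    dy = ax - bx
    c = -(ax * dx + ay * dy)
    # Assumed tie-break (not from the original comment): an edge keeps the
    # pixels lying exactly on it only if dx > 0, or dx == 0 and dy > 0.
    # The reversed edge has (-dx, -dy), so exactly one direction keeps them.
    if not (dx > 0 or (dx == 0 and dy > 0)):
        c -= 1
    return dx, dy, c


def rasterize(v0, v1, v2):
    """Return the (x, y) pixels covered by a triangle with integer vertices."""
    (x0, y0), (x1, y1), (x2, y2) = v0, v1, v2
    area2 = (x2 - x0) * (y1 - y0) - (y2 - y0) * (x1 - x0)
    if area2 == 0:
        return []  # degenerate triangle covers nothing
    if area2 < 0:
        # Flip the winding so the interior is on the non-negative side of every edge.
        (x1, y1), (x2, y2) = (x2, y2), (x1, y1)
    edges = [
        edge_setup(x0, y0, x1, y1),
        edge_setup(x1, y1, x2, y2),
        edge_setup(x2, y2, x0, y0),
    ]
    min_x, max_x = min(x0, x1, x2), max(x0, x1, x2)
    min_y, max_y = min(y0, y1, y2), max(y0, y1, y2)
    # Evaluate each edge once, at the top-left corner of the bounding box.
    row = [min_x * dx + min_y * dy + c for dx, dy, c in edges]
    covered = []
    for y in range(min_y, max_y + 1):
        w = list(row)
        for x in range(min_x, max_x + 1):
            # OR-ing the three values is negative if any one is negative,
            # so a non-negative result means inside all three edges.
            if (w[0] | w[1] | w[2]) >= 0:
                covered.append((x, y))
            for i in range(3):
                w[i] += edges[i][0]  # one pixel right: add dx
        for i in range(3):
            row[i] += edges[i][1]  # one pixel down: add dy
    return covered


if __name__ == "__main__":
    # Two triangles splitting the square (0,0)-(8,8) along its diagonal.
    a = rasterize((0, 0), (8, 0), (8, 8))
    b = rasterize((0, 0), (8, 8), (0, 8))
    print(len(a), len(b), len(set(a) & set(b)))
```

      With the square split along its diagonal like this, the two pixel sets don't intersect and together cover the 8x8 grid of pixels from (0,0) to (7,7) exactly once. Here a set sign bit means outside, so the three values are combined with or.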

      The calculation of all the values consumed by the rasteriser (C,dx,dy) for all 3 sides of a triangle, given the 3 coordinates, is here: https://github.com/ralferoo/spugl/blob/db6e22e18fdf3b4338390...

      Some of the explanations I wrote down while trying to understand Barycentric coordinates (from which this stuff kind of just falls out) ended up here: https://github.com/ralferoo/spugl/blob/master/doc/ideas.txt

      (Apologies if my memory/terminology is a bit hazy on this - it was a very long time ago now!)

      IIRC, in terms of performance, this software implementation filling a 720p screen with perspective-correct texture-mapped triangles could hit 60Hz using only 1 of the 7 SPUs, although the triangles weren't overlapping, so there was no overdraw. The biggest problem was actually saturating the memory bandwidth, because I wasn't caching the texture data, as an unconditional DMA fetch from main memory always completed before the values were needed later in the loop.

    • > One of the biggest issues is getting abutting triangles to render so you don't have overlapping pixels or gaps.
      >
      > I did this stuff for a living 30 years ago.

      So you did CAD or something like that? Since that matters far less in games.

With discrete GPUs pricing themselves out of the consumer space, we may actually need to switch back to software rendering :)