Implementing a tiny CPU rasterizer (2024)

13 days ago (lisyarus.github.io)

This is a great resource. Some others along the same lines:

TinyRenderer: https://news.ycombinator.com/item?id=46410210#46416135

  • May I add Computer Graphics From Scratch, which covers both rasterization and raytracing? https://gabrielgambetta.com/computer-graphics-from-scratch/i...

    I have to admit I'm quite surprised by how eerily similar this website feels to my book. The chapter structure, the sequencing of the concepts, the examples and diagrams, even the "why" section (mine https://gabrielgambetta.com/computer-graphics-from-scratch/0... - theirs https://lisyarus.github.io/blog/posts/implementing-a-tiny-cp...)

    I don't know what to make of this. Maybe there's nothing to it. But I feel uneasy :(

    • Ah yes, great book; thanks for pointing it out. Added to the list.

      As for similarity, I think the sections you've highlighted are broadly similar, but I can't detect any of the phrase-for-phrase copy-pasting that's typical of LLM rewrites or thesaurus find-and-replace. I feel that the topic layout and the motivations for any tutorial or course covering the same subject matter will eventually converge to the same broad ideas.

      The website's sequence of steps is also a bit different compared to your book's. And most tellingly, the code, diagrams, and maths on the website are all different (such assets are usually an instant giveaway of plagiarism). You've got pseudocode; the website uses the C++ standard library to a great extent.

      If it were me, I might rest a little easier :)

      1 reply →

    • It's a standard pipeline; everything from everyone will look roughly similar. Your book likely looks something like previous work. I wouldn't worry about it. PS: I really loved your web tutorials back in the day.

  • Thanks for mentioning pikuma. :-)

    3D software rendering is still the most popular lecture from our school even after all these years. And it really surprises me, because we "spend a lot of time" talking about some old platforms (MS-DOS, Amiga, ST, Archimedes, etc.). But it's fun to see how much doing things manually helps students understand the math and the data movement that the GPU helps automate and vectorize in modern systems.

I tried doing this in Python a while ago. It did not go well, and it really showed how SLOW Python is.

Even with just a 1280x720 window, setting every pixel to a single color by writing values into a byte array and then using a PyGame function to hand it the full frame to draw, I maxed out at around 10 fps. I tried so many things and simply could not get it any faster.

> Triangles are easy to rasterize

Sure, rasterizing a triangle is not so hard, but... you know, rasterizing a rectangle is far, far easier.
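
For what it's worth, the rectangle case really is just a clipped double loop. A rough C++ sketch (assuming a single-byte-per-pixel framebuffer and x0 <= x1, y0 <= y1; names are illustrative):

    #include <algorithm>

    // Fill an axis-aligned rectangle: clamp to the framebuffer, then two loops.
    // No edge equations or fill rules needed, which is why it's the easy case.
    void fill_rect(unsigned char* pixels, int width, int height,
                   int x0, int y0, int x1, int y1, unsigned char color) {
        int xmin = std::max(x0, 0), xmax = std::min(x1, width);
        int ymin = std::max(y0, 0), ymax = std::min(y1, height);
        for (int y = ymin; y < ymax; ++y)
            for (int x = xmin; x < xmax; ++x)
                pixels[y * width + x] = color;
    }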

  • Rasterizing triangles is a nightmare, especially if performance is a goal. One of the biggest issues is getting abutting triangles to render so you don't have overlapping pixels or gaps.

    I did this stuff for a living 30 years ago. Just this week I had Deep Think create a 3D engine with triangle rasterizer in 16-bit x86 for the original IBM XT.

    • It's fairly easy to get triangle rasterisation performant if you think about the problem hard enough.

      Here's an implementation I wrote for the PS3 SPU many moons ago: https://github.com/ralferoo/spugl/blob/master/pixelshaders/t...

      That does perspective-correct texture mapping, and from a quick count of the instructions in the main loop it's approximately 44 cycles per 8 pixels.

      The half-plane evaluation approach also doesn't suffer from any overlapping pixels or gaps, as long as shared edges use identical endpoints and you use fixed-point arithmetic.

      The key trick is to rework each line equation so that it's effectively x*dx + y*dy + C = 0. You can then evaluate A = x*dx + y*dy + C at the top left of the square that encloses the triangle. For every pixel to the right you just add dx, and for every pixel down you just add dy. The sign bit indicates whether the pixel is or isn't inside that side of the triangle, and you can AND/OR the three sides' sign bits together to determine whether a pixel is inside or outside the triangle. (Whether to use AND or OR depends on how you've decided to interpret the sign bit.) There's a rough scalar sketch of this at the end of this comment.

      The calculation of all the values consumed by the rasteriser (C, dx, dy) for all 3 sides of a triangle, given the 3 coordinates, is here: https://github.com/ralferoo/spugl/blob/db6e22e18fdf3b4338390...

      Some of the explanations I wrote down while trying to understand barycentric coordinates (from which this stuff kind of just falls out) ended up here: https://github.com/ralferoo/spugl/blob/master/doc/ideas.txt

      (Apologies if my memory/terminology is a bit hazy on this - it was a very long time ago now!)

      IIRC, in terms of performance, this software implementation filling a 720p screen with perspective-correct texture-mapped triangles could hit 60Hz using only 1 of the 7 SPUs, although the triangles weren't overlapping so there was no overdraw. The biggest problem was actually saturating the memory bandwidth, because I wasn't caching the texture data: an unconditional DMA fetch from main memory always completed before the values were needed later in the loop.
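
      For anyone who wants that trick spelled out in scalar form, here's a rough C++ sketch of the same idea (my own names and winding convention, nothing lifted from the SPU code above):

        #include <algorithm>

        // Edge function dx*x + dy*y + c for the directed edge (x0,y0) -> (x1,y1).
        // With a consistent winding, the triangle interior is the side where all
        // three edge functions are >= 0 (flip the test for the opposite winding).
        struct Edge {
            int dx, dy, c;
            int at(int x, int y) const { return dx * x + dy * y + c; }
        };

        Edge make_edge(int x0, int y0, int x1, int y1) {
            return { y0 - y1, x1 - x0, x0 * y1 - x1 * y0 };
        }

        void rasterize(int x0, int y0, int x1, int y1, int x2, int y2,
                       void (*put_pixel)(int, int)) {
            Edge e0 = make_edge(x0, y0, x1, y1);
            Edge e1 = make_edge(x1, y1, x2, y2);
            Edge e2 = make_edge(x2, y2, x0, y0);

            int min_x = std::min({x0, x1, x2}), max_x = std::max({x0, x1, x2});
            int min_y = std::min({y0, y1, y2}), max_y = std::max({y0, y1, y2});

            // Evaluate each edge function once at the top left of the bounding box...
            int r0 = e0.at(min_x, min_y), r1 = e1.at(min_x, min_y), r2 = e2.at(min_x, min_y);

            for (int y = min_y; y <= max_y; ++y) {
                int a0 = r0, a1 = r1, a2 = r2;
                for (int x = min_x; x <= max_x; ++x) {
                    // ...then it's just one addition per edge per pixel. The sign bit
                    // of each value says which side of that edge the pixel is on;
                    // OR-ing them sets the sign bit if any value is negative (outside).
                    if ((a0 | a1 | a2) >= 0)
                        put_pixel(x, y);
                    a0 += e0.dx; a1 += e1.dx; a2 += e2.dx;
                }
                r0 += e0.dy; r1 += e1.dy; r2 += e2.dy;
            }
        }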

      5 replies →

    • As the other posters have shown, it's not that hard.

      Most graphics specs will explicitly say how the tie-break rules work.

      The key is to work in fixed point (16.8, or even 16.4 if you're feeling spicy). It's not "trivial", but in general you write it once and it's done. It's not something you have to go back to over and over for weird bugs. (There's a rough sketch of how the pieces fit together at the end of this comment.)

      Wide lines are a more fun case…
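
      Roughly, the two ingredients fit together like this; a C++ sketch with illustrative names, where exactly which sides end up "owning" their boundary pixels depends on your winding and y direction:

        #include <cmath>
        #include <cstdint>

        constexpr int FRAC_BITS = 8;   // 16.8 fixed point, as suggested above

        // Snap vertices to the fixed-point grid once, up front, so that two
        // triangles sharing an edge see bit-identical endpoints.
        int32_t to_fixed(float v) { return (int32_t)std::lround(v * (1 << FRAC_BITS)); }

        // Edge function dx*x + dy*y + c for the directed edge (x0,y0) -> (x1,y1).
        struct Edge { int64_t dx, dy, c; };

        // Pixels with an edge value of exactly 0 sit on the edge itself. A shared
        // edge appears with opposite directions in the two abutting triangles, so
        // a rule that accepts exactly one of the two directions hands those pixels
        // to exactly one triangle: no overlap, no gap. This is the usual fill rule
        // ("top-left" in most specs, modulo coordinate conventions).
        bool owns_boundary(int64_t dx, int64_t dy) {
            return dy > 0 || (dy == 0 && dx > 0);
        }

        Edge make_edge(int64_t x0, int64_t y0, int64_t x1, int64_t y1) {
            Edge e{ y0 - y1, x1 - x0, x0 * y1 - x1 * y0 };
            if (!owns_boundary(e.dx, e.dy))
                e.c -= 1;   // turns this edge's ">= 0" test into "> 0"
            return e;
        }

        // The rasteriser then tests dx*x + dy*y + c >= 0 for all three edges per
        // pixel, evaluated incrementally as described elsewhere in this thread.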

      > One of the biggest issues is getting abutting triangles to render so you don't have overlapping pixels or gaps.

      > I did this stuff for a living 30 years ago.

      So you did CAD or something like that? Since that matters far less in games.

      2 replies →

With the discrete GPUs pricing themselves out of the consumer space, we may actually need to switch back to software rendering :)

  • That's too much of a stretch, but I believe games of the next era will be optimized more towards integrated GPUs (such as AMD's iGPU in the Steam Deck and Steam Machine).

    When hardware is priced out of reach for most consumers (along with a global supply chain collapse due to tariffs and a potential Taiwan invasion), a new era awaits where performance optimization is going to be critical again for games. I expect existing game engines like Unity and Unreal Engine to fall out of favor because of all the performance issues they have, and maybe we can return to a temporary "wild west" era where everyone has their own hacky solution to cram stuff into limited hardware.

    • > everyone has their own hacky solution to cram stuff into limited hardware

      Limited hardware gave us a lot of classic titles and fundamental game mechanics.

      Off the top of my head:

      Metal Gear's stealth was born because they couldn't draw enough enemy sprites to make a shooting game. Instead they drew just a few and made you avoid them.

      The foggy atmosphere in Ico and Silent Hill was partially determined by their polygon budgets. They didn't have the hardware to draw distant scenery, so they hid it in fog.

    • That's wishful thinking. The reality is that your iGPU will be used to decode the video stream of an Unreal game running on a dedicated GPU on some cloud server which you pay a monthly subscription fee for.

I'm surprised more indie games don't use software rendering, just to get a more unique style. 640x400 ought to be enough for anybody!