Comment by scaramanga

4 years ago

"it will open a 97 MB binary STL file in about 165 milliseconds flat, on a 2013 Macbook Pro. This is blinding fast."

This actually sounds incredibly slow, that's nearly 1/5th of an entire second. What can it possibly be doing? :)

In case anyone else was wondering: I followed the link and read the description, and the figure is actually the time until the first frame is rendered - not just the time to load the file (which should be essentially infinitely fast). So it's more impressive than it sounds.

From my experience creating loaders for 3D formats (FBX, glTF, LWO), it's not reading the file that takes a long time; it's parsing the data in the file and converting it to a format suitable for rendering in OpenGL. In practice, most people use the terms "parsing" and "loading" interchangeably, or "loading" means "reading + parsing the file".

There can be a lot of processing involved (looking at you, FBX) or relatively little (glTF, probably STL), but there's still going to be at least decompressing the binary data, copying it into buffers, and uploading those to the GPU. So, without knowing how binary STL is specified, parsing 97 MB in 165 ms seems reasonable.
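For context, binary STL is about as simple as 3D formats get: an 80-byte header, a uint32 triangle count, then one fixed 50-byte record per triangle. A minimal reader looks roughly like this (a generic little-endian sketch, not the linked project's actual code; `Vec3`, `Triangle`, and `readBinaryStl` are names made up for the example):

```cpp
#include <cstdint>
#include <cstring>
#include <fstream>
#include <vector>

struct Vec3 { float x, y, z; };

struct Triangle {
    Vec3 normal;    // per-triangle normal as stored in the file
    Vec3 v[3];      // three corner vertices
};

// Binary STL: 80-byte header, uint32 triangle count, then one 50-byte
// record per triangle (12 little-endian floats + a uint16 attribute).
std::vector<Triangle> readBinaryStl(const char* path)
{
    std::ifstream f(path, std::ios::binary);

    char header[80];
    f.read(header, sizeof(header));                 // header is usually ignored

    uint32_t count = 0;
    f.read(reinterpret_cast<char*>(&count), 4);     // triangle count

    std::vector<Triangle> tris(count);
    for (uint32_t i = 0; i < count; i++) {
        char record[50];                            // fixed-size record
        f.read(record, sizeof(record));
        std::memcpy(&tris[i], record, 48);          // normal + 3 vertices; drop the attribute
    }
    return tris;
}
```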

> What can it possibly be doing?

Possibly, two things.

1. Indexing the mesh. STL files don't contain meshes; they have a triangle soup. Indexed meshes are more efficient to render: they save VRAM bandwidth and vertex shader invocations.

2. Computing normals. STL files store per-triangle normals (which can be complete garbage, because most software ignores them); for smooth surfaces you want per-vertex normals. Computing them well (like I did there https://github.com/Const-me/Vrmac#3d-gpu-abstraction-layer ) is slow and complicated. A basic version of both steps is sketched below.
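For illustration, here's roughly what those two steps look like, reusing the hypothetical `Vec3`/`Triangle` types from the reader sketch above. This is a naive generic approach (bitwise hash-map vertex dedup plus area-weighted normal accumulation), not the method the linked Vrmac library actually uses:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

struct IndexedMesh {
    std::vector<Vec3>     positions;
    std::vector<Vec3>     normals;    // per-vertex, computed below
    std::vector<uint32_t> indices;    // 3 per triangle
};

// Hash/equality on the raw float bits, so bit-identical vertices collapse to one index.
struct Vec3Bits {
    size_t operator()(const Vec3& v) const {
        uint32_t bits[3];
        std::memcpy(bits, &v, sizeof(bits));
        size_t h = 0;
        for (uint32_t x : bits) h = h * 1000003u + x;
        return h;
    }
    bool operator()(const Vec3& a, const Vec3& b) const {
        return std::memcmp(&a, &b, sizeof(Vec3)) == 0;
    }
};

IndexedMesh indexAndComputeNormals(const std::vector<Triangle>& tris)
{
    IndexedMesh mesh;
    std::unordered_map<Vec3, uint32_t, Vec3Bits, Vec3Bits> lookup;

    for (const Triangle& t : tris) {
        // 1. Indexing: replace the triangle soup with shared vertices + indices.
        uint32_t idx[3];
        for (int k = 0; k < 3; k++) {
            auto it = lookup.find(t.v[k]);
            if (it == lookup.end()) {
                it = lookup.emplace(t.v[k], (uint32_t)mesh.positions.size()).first;
                mesh.positions.push_back(t.v[k]);
                mesh.normals.push_back({0.0f, 0.0f, 0.0f});
            }
            idx[k] = it->second;
            mesh.indices.push_back(it->second);
        }

        // 2. Normals: recompute the face normal from the geometry (don't trust the
        //    stored one) and accumulate it onto each corner vertex. Leaving the
        //    cross product unnormalized gives a simple area weighting.
        Vec3 e1 = { t.v[1].x - t.v[0].x, t.v[1].y - t.v[0].y, t.v[1].z - t.v[0].z };
        Vec3 e2 = { t.v[2].x - t.v[0].x, t.v[2].y - t.v[0].y, t.v[2].z - t.v[0].z };
        Vec3 n  = { e1.y * e2.z - e1.z * e2.y,
                    e1.z * e2.x - e1.x * e2.z,
                    e1.x * e2.y - e1.y * e2.x };
        for (int k = 0; k < 3; k++) {
            mesh.normals[idx[k]].x += n.x;
            mesh.normals[idx[k]].y += n.y;
            mesh.normals[idx[k]].z += n.z;
        }
    }

    // Normalize the accumulated per-vertex sums.
    for (Vec3& n : mesh.normals) {
        float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
        if (len > 0.0f) { n.x /= len; n.y /= len; n.z /= len; }
    }
    return mesh;
}
```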

Touché – after all, disk-to-RAM is hundreds of MB/s, and faster if it's cached!

In practice, I'm racing mesh loading against "how long does the OS take to give you an OpenGL context", which is rarely below 160 ms (longer if it has to switch from integrated to discrete GPU).

For anything where you actually have to touch every byte of a file (e.g. parsing), it is pretty impressive to get anything to go faster than 100 MB/s. 500 MB/s is blindingly fast, IMHO.

  • Yeah, parsing text with a state machine is slow. Parsing, say, HTTP at that speed would be impressive without SIMD. But this is a binary file full of fixed-size structures, hence my confusion.

    Anyway, the answer is there: it's actually measuring the time to send the mesh to OpenGL (i.e. to the GPU) and get a frame rendered.

How can loading a file be infinitely fast? There is latency before you even receive the first byte of the file.

  • When there is nothing to do. With an array of fixed-size structures, all you need to know is how many there are; then you can increment a pointer past any number of those objects, and that pointer increment itself can probably be compiled away to nothing (sketched after this list).

    It depends on exactly what you're measuring, you see. If you are loading data in order to do something, then the "do something" part will incur essentially all of the cost. So when you subtract the "do something" part to be left with just the "load it" part, you can end up with a meaningless nothing-to-do/doing-it-infinitely-fast kind of result.

    So then, what would you measure? Just peak RAM throughput? You can get that from the data-sheets :)
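To make the "nothing to do" point concrete: once the file bytes are in memory, skipping any number of those fixed-size STL records is a single multiply-and-add on a pointer, which the compiler can often fold away entirely. A toy illustration, following the binary STL layout from the sketch above (the function name is made up):

```cpp
#include <cstddef>
#include <cstdint>

// Given a pointer to the start of an in-memory binary STL file, return a
// pointer just past the first `howMany` triangle records, without reading them.
const uint8_t* skipTriangles(const uint8_t* fileStart, uint32_t howMany)
{
    const uint8_t* firstRecord = fileStart + 80 + 4;    // 80-byte header + uint32 count
    return firstRecord + (size_t)howMany * 50;          // 50 bytes per triangle, O(1)
}
```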

Is that really that slow? I don't know how they even read the file in that amount of time; my drive only does about 125 MB/s.