Comment by maccard

1 day ago

I think you’re fixating on the very specific example. Imagine if instead of 2 + 2 it was multiplying arrays of large matrices. The compiler or runtime would be smart enough to figure out whether it’s worth dispatching the parallelism for you. Basically auto-vectorisation, but for parallelism.

Notably - in most cases, there is no way the compiler can know which of these scenarios is going to happen at compile time.

At runtime, the CPU can figure it out though, eh?

  • I mean, theoretically it's possible. A super basic example: if the data size is known at compile time, the loop could be auto-parallelised, e.g.

        int buf_size = 10000000;
        auto vec = make_large_array(buf_size);
        for (const auto& val : vec)
        {
            do_expensive_thing(val);
        }
    

    this could clearly be parallelised. In a hypothetical C++ that doesn't exist today, the compiler could see that the transformation is valid.

    If I replace it with:

        int buf_size = 10000000;
        cin >> buf_size;
        auto vec = make_large_array(buf_size);
        for (const auto& val : vec)
        {
            do_expensive_thing(val);
        }

    the compiler could generate some code that looks like:

        if (buf_size >= SOME_LARGE_THRESHOLD) { DO_IN_PARALLEL }
        else { DO_SERIAL }

    With some background logic for managing threads, etc. In a C++-style world where "control" is important it likely wouldn't fly, but if this were Python...

        arr_size = 10000000
        buf = [None] * arr_size
        for x in buf:
            do_expensive_thing(x)
    

    could be parallelised at compile time.