This is essentially what any ReLU-based neural network looks like (in many modern models, smoother variants such as GELU have replaced the original ramp function). AI, even LLMs, essentially reduces to a bunch of code like:
let v0 = 0
let v1 = 0.40978399*(0.616*u + 0.291*v)
let v2 = if 0 > v1 then 0 else v1
let v3 = 0
let v4 = 0.377928*(0.261*u + 0.468*v)
let v5 = if 0 > v4 then 0 else v4...
That's a bit far. ReLU does check x > 0, but that's just the one non-linearity in the linear/non-linear sandwich behind the universal approximation theorem. It's more complex than just x > 0.
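To make the "sandwich" concrete, here is a minimal Python sketch (my own illustration, not something from the thread): an affine step, a layer of ReLU hinges, and another affine step that together linearly interpolate x^2 on [0, 1]. The helper name `relu_interpolant` and the 9 evenly spaced knots are assumptions, and this only covers the one-dimensional case; it is not a proof of the theorem.

```python
def relu(x):
    """The ramp function: pass positive values, clamp the rest to zero."""
    return x if x > 0 else 0.0


def relu_interpolant(f, knots):
    """Build a one-hidden-layer ReLU net that linearly interpolates f at the knots.

    Hypothetical helper for illustration; knots must be sorted and distinct.
    """
    ys = [f(k) for k in knots]
    # Slope of each straight segment between neighbouring knots.
    slopes = [
        (ys[i + 1] - ys[i]) / (knots[i + 1] - knots[i])
        for i in range(len(knots) - 1)
    ]
    # Output weight of each hinge = change in slope at that knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, len(slopes))]

    def net(x):
        # affine (x - k)  ->  ReLU  ->  affine (weighted sum plus ys[0])
        return ys[0] + sum(w * relu(x - k) for w, k in zip(weights, knots[:-1]))

    return net


def square(x):
    return x * x


net = relu_interpolant(square, [i / 8 for i in range(9)])
# Worst-case gap between the net and x^2 on a fine grid over [0, 1].
max_err = max(abs(net(i / 1000) - square(i / 1000)) for i in range(1001))
```

Each hinge is just an `x > 0` check, but the approximation only appears once the hinges are shifted and summed by the affine layers around them. For this particular function, halving the knot spacing quarters the worst-case error.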
The ReLU/if-then-else is in fact centrally important, as it enables computations with complex control flow (or, more exactly, conditional signal flow or gating), particularly as you add more layers.
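One way to see the gating, sketched in Python under my own assumptions (the weights below are hypothetical, chosen only for illustration): record which hidden units fire for a given input, and the whole network collapses to a single affine function for that on/off pattern. A different pattern selects a different affine function.

```python
# Hypothetical weights for a 2-input, 3-hidden-unit, 1-output ReLU network.
W1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]
b1 = [0.0, -0.25, 0.5]
W2 = [1.0, -2.0, 0.5]
b2 = 0.1


def forward(u, v):
    """Return (output, gates); gates[i] is 1 when hidden unit i is active."""
    pre = [w[0] * u + w[1] * v + b for w, b in zip(W1, b1)]
    gates = [1 if p > 0 else 0 for p in pre]
    hidden = [g * p for g, p in zip(gates, pre)]  # same as max(0, p)
    out = sum(w * h for w, h in zip(W2, hidden)) + b2
    return out, gates


def affine_for(gates):
    """Collapse the network to out = a*u + c*v + d for one fixed gate pattern."""
    a = sum(W2[i] * gates[i] * W1[i][0] for i in range(3))
    c = sum(W2[i] * gates[i] * W1[i][1] for i in range(3))
    d = sum(W2[i] * gates[i] * b1[i] for i in range(3)) + b2
    return a, c, d


out, gates = forward(1.0, 0.2)
a, c, d = affine_for(gates)
# For this input, out equals a*1.0 + c*0.2 + d up to rounding.
```

With n hidden units there are up to 2^n such patterns, so the if-then-else branches act as a selector over a large family of affine maps, and stacking layers multiplies the possibilities.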
Multiply-accumulate, then clamp negative values to zero. Among v0 through v9, every even-numbered variable is a weighted sum plus a bias (an affine transformation), and every odd-numbered variable is the ReLU gate (max(0, x)) applied to the variable just before it. Layer 2 feeds on the ReLU outputs of layer 1, and v10 is an affine combination of layer 2's ReLU outputs, which a sigmoid then squashes into (0, 1):
// inputs: u, v
// --- hidden layer 1 (3 neurons) ---
let v0 = 0.616*u + 0.291*v - 0.135
let v1 = if 0 > v0 then 0 else v0
let v2 = -0.482*u + 0.735*v + 0.044
let v3 = if 0 > v2 then 0 else v2
let v4 = 0.261*u - 0.553*v + 0.310
let v5 = if 0 > v4 then 0 else v4
// --- hidden layer 2 (2 neurons) ---
let v6 = 0.410*v1 - 0.378*v3 + 0.528*v5 + 0.091
let v7 = if 0 > v6 then 0 else v6
let v8 = -0.194*v1 + 0.617*v3 - 0.291*v5 - 0.058
let v9 = if 0 > v8 then 0 else v8
// --- output layer (binary classification) ---
let v10 = 0.739*v7 - 0.415*v9 + 0.022
// sigmoid squashing v10 into the range (0, 1)
let out = 1 / (1 + exp(-v10))
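For anyone who wants to run it, here is one possible Python transcription of the listing above. The weights and biases are copied from it unchanged; the names `dense_relu` and `predict` and the grouping into per-layer lists are my own assumptions, not part of the original.

```python
import math

# Weights and biases copied from the straight-line listing above.
W1 = [[0.616, 0.291], [-0.482, 0.735], [0.261, -0.553]]  # hidden layer 1
B1 = [-0.135, 0.044, 0.310]
W2 = [[0.410, -0.378, 0.528], [-0.194, 0.617, -0.291]]   # hidden layer 2
B2 = [0.091, -0.058]
W3 = [0.739, -0.415]                                      # output layer
B3 = 0.022


def dense_relu(weights, biases, inputs):
    """One layer: weighted sum plus bias per neuron, then clamp negatives to 0."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def predict(u, v):
    """Forward pass: two ReLU layers, then a sigmoid on the output neuron."""
    h1 = dense_relu(W1, B1, [u, v])   # v1, v3, v5 in the listing
    h2 = dense_relu(W2, B2, h1)       # v7, v9
    logit = sum(w * h for w, h in zip(W3, h2)) + B3  # v10
    return 1.0 / (1.0 + math.exp(-logit))


p = predict(1.0, 1.0)
```

The loop form and the unrolled `let` form compute the same thing; the unrolled one just makes every multiply, add, and branch visible.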
This is why I now program in OCaml: the files themselves are AI (.ml).
I am sure you did not forget that pattern matching.