Comment by mickg10
1 year ago
I.e. ReLU is _piecewise_ linear. The kink at 0 that separates the 2 pieces (a discontinuity in the derivative; the function itself is continuous) is precisely what makes it nonlinear, which is what enables the actual universal approximation.
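For anyone who wants to poke at this, here is a rough sketch in plain Python of the point above. The helper names (`relu`, `is_additive`, `abs_via_relu`) are made up for illustration and do not come from any library. It checks the additivity property f(a + b) = f(a) + f(b) at a couple of points: it holds when everything stays on one piece and fails across the kink at 0. It also shows two ReLUs summing to |x|, which no single linear map can produce.

```python
def relu(x: float) -> float:
    """ReLU: identity for positive inputs, zero otherwise."""
    return max(0.0, x)


def is_additive(f, a: float, b: float) -> bool:
    """Check f(a + b) == f(a) + f(b) at one pair of points (hypothetical helper)."""
    return f(a + b) == f(a) + f(b)


# Both inputs and their sum sit on the positive piece, so ReLU acts linearly here.
same_side = is_additive(relu, 1.0, 2.0)

# The inputs straddle the kink at 0: relu(0) = 0 but relu(1) + relu(-1) = 1.
across_kink = is_additive(relu, 1.0, -1.0)


def abs_via_relu(x: float) -> float:
    """Two ReLU units summed give |x|, a function that is not linear."""
    return relu(x) + relu(-x)


print(same_side, across_kink, abs_via_relu(-3.0))
```

Passing the check at one pair of points does not prove linearity, of course; a single failing pair is enough to rule it out.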
1 year ago
I.e. ReLU is _piecewise_ linear. The kink at 0 that separates the 2 pieces (a discontinuity in the derivative; the function itself is continuous) is precisely what makes it nonlinear, which is what enables the actual universal approximation.
Which is what I said two replies ago.
I followed that with "in some sense it's [ReLU] still even MORE linear than tanh or sigmoid functions are". There's no way you misunderstood that sentence, or took it as my "definition" of linearity... so I guess you just wanted to reaffirm I was correct, again, so thanks.