Comment by porridgeraisin

14 hours ago

> You don't need any of that to get a basic MLP working using a for loop and naive gradient descent.

Well sure. Your initial statement was about "most applied ML".

> Rate of change -> it is flat -> that is not a useful signal. I don't see the issue?

It's not going to be zero if you sample in your practicum setting. You're gonna get "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn". So yeah.
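
A minimal sketch of how that error shows up, assuming PyTorch (the message format suggests it). The variable names and the toy 4-way categorical are made up for illustration, not taken from the thread:

```python
import torch

# Hypothetical toy setup: 4 logits we would like gradients for.
logits = torch.zeros(4, requires_grad=True)
probs = torch.softmax(logits, dim=0)

# Drawing a sample is not a differentiable op: the result is an integer
# tensor that is detached from the autograd graph (no grad_fn).
sample = torch.multinomial(probs, num_samples=1)

# Any "loss" built only from the sample is detached as well.
loss = sample.float().sum()

message = None
try:
    loss.backward()
except RuntimeError as err:
    # Autograd refuses outright instead of handing back a zero gradient.
    message = str(err)

print(message)
```

So the gradient here isn't a flat zero you can read as "no signal"; `backward()` just raises, and `logits.grad` stays `None`.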