Comment by mjhay

2 years ago

Kernelization can be done in the primal or the dual. By the representer theorem, the solution never needs more parameters than there are data points. In the primal with a kernel K, you're just doing a feature expansion: each training point x contributes one feature, and the value of that feature at any point y is K(x, y).
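Here is a minimal NumPy sketch of that feature-expansion view. It assumes an RBF kernel and a ridge-style squared loss, neither of which is specified above, and the function names (`rbf_kernel`, `kernel_features`, `fit_primal`, `fit_dual`) are made up for illustration. The primal fit penalizes the weights with w^T K w, which is an assumption chosen so that the primal problem matches the usual dual kernel ridge solution.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise K(a, b) = exp(-gamma * ||a - b||^2) for rows of A and B.
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kernel_features(X_train, X, gamma=1.0):
    # Feature expansion: column j is the feature attached to training
    # point X_train[j]; its value at row X[i] is K(X_train[j], X[i]).
    return rbf_kernel(X, X_train, gamma)

def fit_primal(X_train, y, lam=0.1, gamma=1.0):
    # Ordinary regularized least squares on the expanded features:
    # minimize ||Phi w - y||^2 + lam * w^T K w  (penalty is an assumption).
    Phi = kernel_features(X_train, X_train, gamma)  # n x n, one weight per point
    K = Phi                                         # same matrix on the training set
    return np.linalg.solve(Phi.T @ Phi + lam * K, Phi.T @ y)

def fit_dual(X_train, y, lam=0.1, gamma=1.0):
    # Standard dual kernel ridge solution: alpha = (K + lam I)^{-1} y.
    K = rbf_kernel(X_train, X_train, gamma)
    return np.linalg.solve(K + lam * np.eye(len(y)), y)

def predict(X_train, coef, X_new, gamma=1.0):
    # Same prediction rule for either fit: f(x) = sum_j coef[j] * K(x_j, x).
    return kernel_features(X_train, X_new, gamma) @ coef
```

When K is invertible, the two fits return the same coefficient vector, so either way you end up with one parameter per training point. With a plain ||w||^2 penalty on the expanded features you would still get a valid model, just not the same one as dual kernel ridge.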