Comment by MontyCarloHall
3 days ago
A neural-GP could probably be trained with the same parallelization efficiency via consistent discretization of the input space. I think their absence owes more to the fact that discrete data (namely, text) has dominated AI applications. I imagine that neural-GPs could be extremely useful for scale-free interpolation of continuous data (e.g. images), or for other non-autoregressive generative models (scale-free diffusion?).
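The "scale-free" property being alluded to is that a GP posterior can be queried at any continuous location, so one model serves every output resolution. A minimal sketch of that idea with a plain (non-neural) GP and an assumed RBF kernel, on a 1-D signal standing in for image data:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

# Coarse training samples of a continuous signal.
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train)

# Exact GP posterior mean; small jitter for numerical stability.
K = rbf_kernel(x_train, x_train) + 1e-8 * np.eye(len(x_train))
alpha = np.linalg.solve(K, y_train)

def gp_mean(x_query):
    # Query at ANY resolution -- the model is not tied to a fixed grid.
    return rbf_kernel(x_query, x_train) @ alpha

# The same fitted model rendered at two different resolutions.
coarse = gp_mean(np.linspace(0, 1, 16))
fine = gp_mean(np.linspace(0, 1, 256))
```

A neural-GP would replace the fixed RBF kernel with learned components, but the query-anywhere property that makes scale-free interpolation attractive is the same.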
Right, I think there are plenty of other approaches that scale just as easily, or better. It's like you said: the (early) dominance of text data artificially narrowed the set of approaches that got tried.