Comment by nutanc
1 year ago
This is true. It is the reason why, in many of our experiments with a new algorithm, KESieve, we find the planes much faster than traditional deep learning training approaches do. The premise is that a neural network builds planes which separate the data and adjusts these planes through an iterative learning process. What if we could find a non-iterative method that draws these same planes? We have been trying this, and so far we have been able to replace most network layers using this approach. We haven't tried it for transformers yet, though.
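As a rough illustration of what "drawing a plane without iterating" can mean (this is not KESieve, whose details aren't given in this comment), here is a minimal sketch that places a separating plane in closed form as the perpendicular bisector between two class means. The function names `bisecting_plane` and `predict` and the toy data are assumptions made up for the example.

```python
import numpy as np

def bisecting_plane(X, y):
    """Return (w, b) for the plane w.x + b = 0 that sits halfway
    between the two class means. No gradient steps are involved."""
    mu0 = X[y == 0].mean(axis=0)   # mean of class 0
    mu1 = X[y == 1].mean(axis=0)   # mean of class 1
    w = mu1 - mu0                  # normal vector points from class 0 to class 1
    b = -w @ (mu0 + mu1) / 2.0     # plane passes through the midpoint of the means
    return w, b

def predict(X, w, b):
    """Label a point 1 if it falls on the class-1 side of the plane, else 0."""
    return (X @ w + b > 0).astype(int)

# Toy data (hypothetical): two well-separated Gaussian blobs in 2D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-3.0, 0.5, size=(50, 2)),
    rng.normal(3.0, 0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)

w, b = bisecting_plane(X, y)
accuracy = (predict(X, w, b) == y).mean()
print("train accuracy:", accuracy)
```

A plane like this only works when the classes are roughly linearly separable around their means; it is meant to show the one-shot, non-iterative flavor of the idea, not to stand in for a trained layer.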
Some links if interested:
[1] https://gpt3experiments.substack.com/p/understanding-neural-...
[2] https://gpt3experiments.substack.com/p/building-a-vector-dat...