Comment by kannanvijayan
7 hours ago
I read through the article, and I'm not sure this is dependent on quadratic scaling.
Are they allowing all oscillators to influence all others, or are they picking modalities where the influences can be limited to some maximal fixed degree?
One would imagine that there'd be a variety of different topologies available to explore. Even if the treatment was fully connected during training, one could imagine the training itself biasing towards a maximal fixed degree per oscillator, with inference later operating on a quantized version that drops the low-weight influences to zero.
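To make the last idea concrete, here is a rough sketch of what dropping low-weight influences could look like. Everything in it is an assumption for illustration: the article does not describe this procedure, the names `prune_to_max_degree` and `count_edges` are hypothetical, and the coupling matrix is made-up toy data where `weights[i][j]` stands for the influence of oscillator `j` on oscillator `i`.

```python
def prune_to_max_degree(weights, max_degree):
    """Keep only the max_degree largest-magnitude influences on each oscillator.

    Hypothetical helper: weights[i][j] is assumed to be the influence of
    oscillator j on oscillator i. Self-coupling is always dropped.
    """
    pruned = []
    for i, row in enumerate(weights):
        # Rank the other oscillators by how strongly they influence oscillator i.
        candidates = [j for j in range(len(row)) if j != i]
        candidates.sort(key=lambda j: abs(row[j]), reverse=True)
        keep = set(candidates[:max_degree])
        # Zero out everything that did not make the cut.
        pruned.append([w if j in keep else 0.0 for j, w in enumerate(row)])
    return pruned


def count_edges(weights):
    """Count the nonzero influences left in a coupling matrix."""
    return sum(1 for row in weights for w in row if w != 0.0)


# Toy fully connected coupling matrix for 4 oscillators (made-up values).
dense = [
    [0.0, 0.9, 0.05, -0.4],
    [0.8, 0.0, 0.3, 0.01],
    [-0.02, 0.7, 0.0, 0.6],
    [0.5, -0.03, 0.2, 0.0],
]
sparse = prune_to_max_degree(dense, max_degree=2)
```

With a fixed maximum degree k, the number of surviving influences is at most N * k rather than N * (N - 1), so the inference-time cost would grow linearly in the number of oscillators instead of quadratically. Whether accuracy survives that kind of pruning is a separate question this sketch doesn't answer.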