Comment by vlovich123
4 hours ago
While you are correct at a higher level, comparing bigrams/trigrams would be less compute, not more, because there are fewer of them in a given text.
I'm correct on the technical level as well: https://chatgpt.com/s/t_698293481e308191838b4131c1b605f1
That math is for comparing all n-grams for all n <= N simultaneously, which isn't what was being discussed.
For any fixed n-gram size, the complexity is still O(N^2), same as standard attention.
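A rough sketch of the counting argument, not anything from the linked chat. It assumes N is the sequence length in tokens, n-grams are contiguous, and "comparing" means every ordered pair of n-gram positions, as in full attention. The function names are made up for illustration.

```python
def ngram_count(seq_len: int, n: int) -> int:
    """Number of contiguous n-grams in a sequence of seq_len tokens."""
    return max(seq_len - n + 1, 0)


def comparisons_fixed_n(seq_len: int, n: int) -> int:
    """Ordered pairs of n-gram positions for one fixed n (assumed all-pairs)."""
    count = ngram_count(seq_len, n)
    return count * count


def comparisons_all_n(seq_len: int) -> int:
    """Sum of the all-pairs counts over every n from 1 to seq_len."""
    return sum(comparisons_fixed_n(seq_len, n) for n in range(1, seq_len + 1))


if __name__ == "__main__":
    seq_len = 1000
    print("unigram pairs:", comparisons_fixed_n(seq_len, 1))
    print("bigram pairs: ", comparisons_fixed_n(seq_len, 2))
    print("all n pairs:  ", comparisons_all_n(seq_len))
```

Under those assumptions, a fixed n gives (N - n + 1)^2 pairs, which is quadratic and slightly smaller for bigrams than for unigrams. Summing over every n gives N(N+1)(2N+1)/6, which is cubic.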
I was talking about all n-gram comparisons.