Comment by thesz

7 hours ago

  > but I think it's unfair to say that LLMs _cannot_ recognize or generate CFGs.

In that paper, the models recognize and/or generate only finite (<800 chars) grammars.

Usually, file sizes on a typical Unix workstation follow a bimodal log-normal distribution (a mixture of two log-normal distributions), with the heavy tails that log-normality implies [1]. The authors of the paper did not attempt to model that distribution.
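
To make the shape of that distribution concrete, here is a rough sketch that samples sizes from a two-component log-normal mixture and checks how many fall above the 800-character limit. The function name, weights, medians and sigmas are all made up for illustration; they are not fitted to my home directories or to any real data.

```python
import math
import random


def sample_file_size(rng,
                     weight_small=0.7,            # assumed share of "small" files
                     mu_small=math.log(2_000),    # assumed median ~2 KB
                     sigma_small=1.0,
                     mu_large=math.log(500_000),  # assumed median ~500 KB
                     sigma_large=1.5):
    """Draw one size in bytes from a mixture of two log-normal distributions."""
    if rng.random() < weight_small:
        return rng.lognormvariate(mu_small, sigma_small)
    return rng.lognormvariate(mu_large, sigma_large)


rng = random.Random(0)  # fixed seed so repeated runs give the same sample
sizes = [sample_file_size(rng) for _ in range(100_000)]

share_over_800 = sum(s > 800 for s in sizes) / len(sizes)
median = sorted(sizes)[len(sizes) // 2]
largest = max(sizes)

print(f"share above 800 bytes: {share_over_800:.3f}")
print(f"median: {median:.0f} bytes, largest: {largest:.0f} bytes")
```

With parameters like these, most samples land above 800 bytes and the largest one sits orders of magnitude above the median, which is the heavy tail a length-capped benchmark never sees.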

[1] This was true for my home directories for several years.