Comment by williamcotton

2 years ago

I don't think "lossy text" is a useful term because it conflates with th*s k*nd *f l*ss* t*xt as well. Lossy compression is designed to be as reversible as it can be up to a given threshold. That's not how ChatGPT was designed, nor how it works in practice. There are definitely a lot of mathematical similarities between the two, I won't deny that.
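
To make "reversible up to a given threshold" concrete, here's a toy sketch of my own (Python, not from the article): uniform quantization throws information away, but the reconstruction error is guaranteed to stay under a bound you pick in advance. ChatGPT's outputs carry no such guarantee.

    # Lossy compression with a hard error bound: quantizing to the
    # nearest multiple of `step` loses information, but every restored
    # sample is guaranteed to be within step/2 of the original.
    def compress(samples, step=0.1):
        return [round(x / step) for x in samples]

    def decompress(codes, step=0.1):
        return [c * step for c in codes]

    signal = [0.12, 0.47, 0.83, 0.99]
    restored = decompress(compress(signal))

    # The loss threshold holds for every sample.
    assert all(abs(a - b) <= 0.05 for a, b in zip(signal, restored))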

Would "partial knowledge compression" be a better term? Partial knowledge of both English and French is a requirement to reliably translate from English to French. Partial knowledge of both baseball box scores and entertaining paragraph outlines in English is a requirement to reliably translate from a box score into an entertaining outline, right?

To me, "lossy compression" vs. "partial knowledge compression" sounds like six vs. a half-dozen. Whatever you call it, I think the author was writing more about how we perceive the results generated from a language compression model vs. an image compression model.

  • The reason the author chose the term "lossy compression" was to make it seem like ChatGPT is nothing but a thing that makes things blurry. Do you see a single mention of ChatGPT being a reliable translator in that article, or any sort of distinction made between the different kinds of tasks the model is used for?

    So it is nothing like six vs. a half-dozen, because those mean the same thing. "Lossy compression" is a bad description of half of what ChatGPT does, which makes it a bad description overall, and not at all equal to a more thoughtful, less emotional one.

    • I think we're talking about different things, but that said, call it whatever you like: lossy, partial, sub-sampling...