Comment by slg

21 days ago

It seems very strange to define these terms based on how difficult the things they describe are to reproduce.

Let's look at the sibling comment's example of a nuclear bomb. That's "not simple for anyone to reproduce without significant access" and as citizens we don't "have a say in the security practices used to safeguard it." And international laws have done a relatively good job keeping them out of the hands of bad actors. Does that make them a dataset?

Contrast that with data that is easy to reproduce, like, say, the names of the 45 different Presidents of the US. That is obviously a dataset. Yet there is no private information involved; it is all public data. Many people can even produce that list entirely from memory. But having that list on a piece of paper in front of me could still be a helpful tool if I were taking a US history test.