
Comment by AstralStorm

6 years ago

Submitted data is still data; at worst it is biased, which may or may not matter.

The question is always what the bias is, and whether collecting much less data yourself is preferable. Your non-submission sampling tactic may be biased too. (E.g. telephone questionnaires select for people who have free time on demand. Emails select for people with bad spam filters who are on the mailing list. Walking around to ask has other limitations, such as range and, again, availability. Asking third parties may be biased too, just like asking first parties.)

Usually, when there are lots of unique submissions, the question of bias or lack of representation can be put to rest.

If, e.g., the submissions show racial biases compared to the baseline population, this can be taken into account. Likewise if there is bias due to some school districts responding less or more. You will have to handle these issues anyway.
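
One common way to take a known bias into account is post-stratification: reweight each response by how under- or over-represented its group is relative to the baseline population. The sketch below is only an illustration under assumptions I am adding: the function name `poststratified_mean`, the group labels and the population shares are all hypothetical, and it assumes you actually know the baseline shares and that every group has at least one response.

```python
from collections import Counter


def poststratified_mean(samples, population_shares):
    """Weighted mean of values, reweighted to match known population shares.

    samples: list of (group, value) pairs from the submitted data.
    population_shares: dict mapping group -> share of the baseline
        population (hypothetical numbers; must cover every group seen).
    """
    counts = Counter(group for group, _ in samples)
    n = len(samples)
    weighted_sum = 0.0
    total_weight = 0.0
    for group, value in samples:
        sample_share = counts[group] / n
        # Over-represented groups get weight < 1, under-represented > 1.
        weight = population_shares[group] / sample_share
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight


# Hypothetical example: district "A" responded three times as often as "B",
# but both make up half of the baseline population.
responses = [("A", 1.0), ("A", 1.0), ("A", 1.0), ("B", 0.0)]
shares = {"A": 0.5, "B": 0.5}
raw_mean = sum(v for _, v in responses) / len(responses)
adjusted_mean = poststratified_mean(responses, shares)
```

Here the raw mean is pulled toward district "A", while the adjusted mean treats both districts equally. This only corrects for the groupings you know about; it does nothing for bias inside a group.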

If you guess what the representative sample might be, you may be committing scientific fraud...