Comment by mkwarman
2 years ago
This is an interesting explanation, but wouldn't `I` influence `T` rather than the other way around? Since the type of ice cream determines the amount of time taken in the store.
My comment did two things (though they were somewhat muddled). It: (a) laid out a particular model; and (b) offered explanations/claims of causality. Unfortunately, it said nothing about (c) experimental design.
I'll start with (c). Attempting to talk about a model in isolation from its experimental design can be misleading, as it ignores the context that gives the model its interpretive power and validity. In this case, a good experimental design must include a sufficiently diverse sample of people to account for variation.
Regarding (b), depending on the person, the influence could flow either way between `I` and `T`, to varying degrees.
- Example of `I->T`: One person might come into the store strongly preferring one type of ice cream (`I`) and be willing to take time to look for it (`T`)
- Example of `T->I`: Another person might come into the store in a hurry (`T`) and be motivated to grab the closest ice cream flavor (`I`).
Regarding (a), no model is 'true' but some are better than others for particular purposes.
- To the extent that prediction is the key goal, confounding variables usually matter less, provided the data at prediction time resembles the data the model was fit on.
- But to the extent that _statistical inference_ is the key goal, there are many techniques for teasing apart influence.
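To make the prediction-versus-inference distinction concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a hypothetical hidden variable `z` (say, how rushed the shopper is) drives both `t` (time in store) and `i` (a numeric stand-in for ice cream choice), and there is no direct effect between `t` and `i` at all. The coefficients 2.0 and 3.0 are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical confounder: z influences both t and i.
z = rng.normal(size=n)
t = 2.0 * z + rng.normal(size=n)  # time in store
i = 3.0 * z + rng.normal(size=n)  # ice cream score; no direct t -> i effect


def ols(columns, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Naive model: regress i on t alone, ignoring the confounder.
naive = ols([t], i)

# Adjusted model: include z, one simple way of teasing apart influence.
adjusted = ols([t, z], i)

# How well does the naive model predict i from t?
pred = naive[0] + naive[1] * t
r2 = 1 - np.sum((i - pred) ** 2) / np.sum((i - i.mean()) ** 2)

print(f"naive slope on t:    {naive[1]:.2f}")
print(f"adjusted slope on t: {adjusted[1]:.2f}")
print(f"naive R^2:           {r2:.2f}")
```

In this setup the naive model predicts `i` from `t` reasonably well, yet its slope says nothing about what would happen if you intervened on `t`. Once `z` is included, the slope on `t` shrinks toward zero, which matches how the data was generated. Real data rarely hands you the confounder this conveniently, which is where experimental design comes back in.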
Unfortunately, too often in machine learning contexts, the word "inference" refers to the process of using a trained model for _prediction_. Yikes. This contrasts sharply with the term's use in statistics. The field of statistics got this one right, even as ML techniques have taken off spectacularly.