Comment by 0cf8612b2e1e
7 hours ago
Under the Known Limitations section
deleted and dead are integers. They are stored as 0/1 rather than booleans.
Is there a technical reason to do this? You have the type right there.
By "to do this" do you mean to not use booleans? It's because the value does not represent a binary true or false but rather a means by which the item is deleted or dead. So not only would it not make sense semantically, it would break if a third means were introduced.
> It's because the value does not represent a binary true or false but rather a means by which the item is deleted or dead.
"Deleted" and "dead" are separate columns.
> So not only would it not make sense semantically, it would break if a third means were introduced.
If that was the intention, it seems like a bad design decision to me. What you assume to be the reasoning is exactly what should be avoided.
This is a limitation not because the bool value is represented by an int (or rather "presented as" one), but because the type itself is an integer.
Funny, because the HackerNews API [0] does return booleans for those fields. That is, a state, not a type of deletion or death.
[0] https://github.com/HackerNews/API
The API documents this, but from a spot check I'm not sure when you'd ever get a response with deleted: false. For non-deleted items the deleted key is simply absent (null). I suppose the data model could treat this as a not-null field with a default value of false, but that doesn't feel right to me. I might handle that case in cleaning, but I wouldn't do it in the extract.
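To make the extract-versus-cleaning split concrete, here is a minimal sketch of what I mean. It assumes items arrive as plain dicts in which deleted and dead are either absent, booleans, or 0/1 integers; the function name clean_item and the sample items are made up for illustration and are not taken from the dataset or the API.

```python
from typing import Any, Dict

# Flag fields discussed in this thread; anything else is passed through untouched.
FLAG_FIELDS = ("deleted", "dead")


def clean_item(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a raw item with the flag fields coerced to booleans.

    The raw (extracted) item is left unmodified, so the extract still records
    whether the key was actually present in the response.
    """
    cleaned = dict(raw)
    for field in FLAG_FIELDS:
        # An absent key, None, 0 and False all become False;
        # any truthy value such as 1 or True becomes True.
        cleaned[field] = bool(raw.get(field))
    return cleaned


# Hypothetical sample items, not real API responses.
raw_live = {"id": 1, "type": "story"}        # flags absent
raw_deleted = {"id": 2, "deleted": True}     # boolean flag
raw_int_flag = {"id": 3, "dead": 1}          # 0/1 integer flag

cleaned_live = clean_item(raw_live)
cleaned_deleted = clean_item(raw_deleted)
cleaned_int_flag = clean_item(raw_int_flag)
```

The default of false only exists in the cleaned copy, so the choice to treat "absent" as "false" stays an explicit, reversible cleaning decision.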