Comment by d3m0t3p
1 year ago
You might want to check pola.rs then; it's backed by the Apache Arrow memory model and it's written in Rust. All the columns have a defined type, and you can easily catch a mistake when loading data.
Unless I'm misunderstanding, Arrow addresses data representation on disk and in memory, for both pandas and polars, while I'm writing about type inference during static analysis, which Arrow doesn't solve.
Having a type-checking system respect Arrow schemas is indeed our ideal. Will polars, during a mypy static type-checking invocation, catch something like `df.this_col_is_missing` as an error? If so, that's what we want, that's great!
FWIW, we donated some of the first versions of what became Apache Arrow ;-)
I've been hunting down column-level typing for a while and did not realise polars had this! That's an absolute game changer, especially if it could cover things like nullability, uniqueness, etc.
It's not static, it's basically the same as pandas. Your editor will not know the type of a given column or whether it even exists; all of that happens at runtime.
do you have a reference for how to use static typing for polars columns? I haven't seen this in their docs...