Comment by TheTaytay
5 days ago
Looks really cool. In reading through the FAQ, it says this: Q: "How are text features handled?" A: "In the local package version text features are encoded as categoricals without considering their semantic meaning. Our API automatically detects text features and includes their semantic meaning into our prediction. The local package version encodes text as numerical categories and does not include semantic meaning."
So that means that automatic embedding/semantic meaning is reserved for API use of TabPFN, right? Otherwise, if I use it locally, it's going to assign each of my distinct text values an arbitrary int, right?
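To make "an arbitrary int per distinct text value" concrete, here is a minimal sketch in plain Python. The helper `encode_as_categories` and the sample strings are hypothetical, and this is not TabPFN's actual encoding code; it only illustrates the general idea of categorical encoding without semantics.

```python
def encode_as_categories(values):
    """Map each distinct string to an integer code, in order of first appearance.

    Hypothetical helper for illustration only; not taken from TabPFN.
    """
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            # The next unused integer is assigned; it carries no meaning.
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


reviews = ["great product", "terrible product", "great product", "excellent item"]
codes, mapping = encode_as_categories(reviews)
print(codes)  # [0, 1, 0, 2]
```

Note that "great product" (0) and "excellent item" (2) mean nearly the same thing but get unrelated codes, while "terrible product" (1) sits between them. That is the information a semantic embedding would keep and a plain categorical encoding throws away.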
Yes, exactly. The API is the best way to handle text features, since the actual semantics often matter a lot. Is the API an option for you, or would you need this to run locally?