Comment by storyweaver2

8 days ago

Did you compare the performance with o1 or Claude 3.5 Sonnet?

Author here! The fundamental challenge is that LLMs like o1 and Claude 3.5 Sonnet simply aren't built for the structure of tabular data. Serializing a 10,000 x 100 table into a token sequence, with every numerical value chopped into subword tokens, is massively inefficient (rough arithmetic below).
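As a quick back-of-envelope sketch (the tokens-per-cell figure is an assumption for illustration, not a measurement of any particular tokenizer):

```python
# Rough estimate of the token cost of serializing a 10,000 x 100 numeric
# table as text for an LLM. The per-cell figures are illustrative
# assumptions, not measurements from a specific tokenizer.

rows, cols = 10_000, 100
tokens_per_cell = 3   # e.g. "0.4271" typically splits into several subword tokens
tokens_per_sep = 1    # comma / newline separators

total_tokens = rows * cols * (tokens_per_cell + tokens_per_sep)
print(f"~{total_tokens:,} tokens")  # ~4,000,000 tokens
```

That's on the order of millions of tokens before the model has learned anything from the data, far beyond the roughly 100k-200k context windows of current frontier LLMs.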

There's some interesting work on using LLMs for tabular data (TabLLM: https://proceedings.mlr.press/v206/hegselmann23a.html), but it only works for datasets with tens of samples, not the thousands of rows you need in real-world applications.

What o1 and other LLMs typically do is wrap existing tabular tools like XGBoost or scikit-learn. That works, but it's ultimately capped by those tools' limitations. We're taking a fundamentally different approach: building foundation models that natively understand tabular relationships and patterns, with architectures designed specifically for tabular data rather than borrowed from text.
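To make "wrapping" concrete, here's a minimal sketch of the kind of boilerplate an LLM typically emits for a tabular task; it orchestrates scikit-learn rather than modeling the table itself (the dataset here is just scikit-learn's built-in example, not anything from our benchmarks):

```python
# Typical LLM-generated tabular pipeline: glue code around scikit-learn.
# Its performance ceiling is whatever the underlying library can do.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = GradientBoostingClassifier().fit(X_train, y_train)
print(roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```

A foundation model trained natively on tables sidesteps this pattern entirely: there's no per-dataset pipeline to generate, because the model itself has already learned how tabular relationships behave.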