Comment by arcb
2 days ago
I suspect stronger edge performance will come as a side effect of local inference. Your point on edge tool calls is interesting and I'll think about that. Features like offline mode could be a great motivating reason. Re knowing the shape but not the internals - I'm mixed here. It feels like there's always a sampling period where you have to look at the contents in order to understand what you want. But edge AI (like antirez's work running DeepSeek on Mac) will let you have both. I'm excited for that future!
Why would an LLM want to look into the contents? What for?
We have low-cardinality data, and yes, this is safe to share and required to build an actual query.
Then we have high-cardinality and possibly PII data - there’s absolutely no reason to share that, there’s nothing for an LLM to analyse there. Also, a semantic index (vector search) will find relevant records much faster and more accurately than any chain-of-thought, with just an LLM-authored search function call.
Further, there are continuous numerical values, and there’s not much an LLM needs to see in there either. You could say, for example, that looking at data distributions when building your analysis can drive your analysis logic, but another point of view is that it creates unnecessary bias instead.
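A minimal sketch of that split, in Python, under stated assumptions: the function name `shape_for_llm`, the `max_distinct` threshold and the sample rows are all hypothetical, and every numeric column is treated as continuous, which is a simplification. The idea is that only column names, types and the values of low-cardinality columns would go into a prompt; high-cardinality columns (possible PII) and numeric columns contribute their type and nothing else.

```python
from typing import Any


def shape_for_llm(rows: list[dict[str, Any]], max_distinct: int = 5) -> dict[str, dict[str, Any]]:
    """Describe columns for a prompt without leaking high-cardinality or numeric values.

    Assumes cell values are hashable scalars (str, int, float, bool, None).
    """
    columns: dict[str, dict[str, Any]] = {}
    for name in sorted({key for row in rows for key in row}):
        values = [row[name] for row in rows if row.get(name) is not None]
        summary: dict[str, Any] = {"type": type(values[0]).__name__ if values else "unknown"}
        # Simplification: any int/float column counts as continuous (bool does not).
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        distinct = set(values)
        # Only low-cardinality, non-numeric columns expose their actual values.
        if values and not is_numeric and len(distinct) <= max_distinct:
            summary["values"] = sorted(distinct, key=str)
        columns[name] = summary
    return columns


# Hypothetical sample data, for illustration only.
SAMPLE_ROWS = [
    {"email": "a@example.com", "plan": "free", "spend": 0.0},
    {"email": "b@example.com", "plan": "pro", "spend": 41.5},
    {"email": "c@example.com", "plan": "free", "spend": 12.25},
]

summary = shape_for_llm(SAMPLE_ROWS, max_distinct=2)
print(summary)
```

With this sample, `plan` is the only column whose values are exposed; `email` is over the threshold and `spend` is numeric, so both are reduced to a type. A real threshold would depend on the dataset, and a low-cardinality column can still be sensitive, so this is a starting point and not a privacy guarantee.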
On re-read I think I might have overreached in my reply. I think having local LLMs able to run tool loops to _transform_ data, rather than just summarize or analyze it, will become 1/ great for non-technical users, 2/ fast.