Comment by wolvoleo
1 day ago
I have a hybrid model here. For many, many tasks a local 12B or similar works totally fine. For the rest I use the cloud; those things tend to be less privacy-sensitive anyway.
Like when someone sends me a message, I made something that categorises it for urgency. If I used the cloud, it would mean they get a copy of all those messages. But locally there's no issue, and complexity-wise it's pretty low for an LLM.
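Roughly, the shape of it is something like this sketch (assuming Ollama's HTTP API on localhost:11434; the model tag, prompt, and label set here are just placeholders, not my actual setup):

```python
# Minimal sketch of a local urgency classifier against an Ollama server.
# Assumptions: Ollama listening on localhost:11434, a model tag like "mistral:7b".
import requests

URGENCY_LABELS = ["low", "normal", "high", "urgent"]

def classify_urgency(message: str, model: str = "mistral:7b") -> str:
    prompt = (
        "Classify the urgency of the following message. "
        f"Answer with exactly one word from {URGENCY_LABELS}.\n\n"
        f"Message:\n{message}"
    )
    # Ollama's generate endpoint; stream=False returns a single JSON object.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=60,
    )
    resp.raise_for_status()
    answer = resp.json()["response"].strip().lower()
    # Fall back to "normal" if the model wanders off the label set.
    return answer if answer in URGENCY_LABELS else "normal"

if __name__ == "__main__":
    print(classify_urgency("The server room is flooding, call me NOW"))
```

The nice part is the messages never leave the box; the only moving part is a small local model and one HTTP call.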
Things like research jobs I do run in the cloud, but they don't really contain any personal content; they just research using sources they already have access to anyway. Same with programming: there's nothing really sensitive in there.
Nice. You're nailing exactly what I'm working towards already. I'm programming with Gemini for now and have no problem there, but the home use case I found for local Ollama was "taking a billion old bookmarks and tagging them." I'm looking forward to pointing Ollama at more personal stuff.
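For the bookmark job, this is roughly the shape of it (a sketch only, assuming Ollama's local HTTP API and a placeholder model tag; the prompt and tag count are just what I'd start with):

```python
# Rough sketch of bulk bookmark tagging against a local Ollama instance.
# Assumptions: Ollama on localhost:11434, a model tag like "llama3:8b",
# and bookmarks as simple (title, url) pairs.
import requests

def tag_bookmark(title: str, url: str, model: str = "llama3:8b") -> list[str]:
    prompt = (
        "Suggest 3-5 short lowercase tags for this bookmark, "
        "as a comma-separated list and nothing else.\n"
        f"Title: {title}\nURL: {url}"
    )
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    raw = resp.json()["response"]
    return [t.strip() for t in raw.split(",") if t.strip()]

if __name__ == "__main__":
    bookmarks = [
        ("Ollama", "https://ollama.com"),
        ("Hacker News", "https://news.ycombinator.com"),
    ]
    for title, url in bookmarks:
        print(title, "->", tag_bookmark(title, url))
```

Looping this over a few thousand bookmarks is slow but free, and nothing ever leaves the machine.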
Yeah, I have two servers now: one with a big AMD GPU for decent LLM performance, and one with a smaller Nvidia card that mostly runs Whisper and some small models for side tasks.