175K+ publicly-exposed Ollama AI instances discovered

5 hours ago (techradar.com)

I tried a few of these ... they are pretty slow. If you are looking for free inference, you'd have to be pretty desperate.

Many appear to be proxies. I'm familiar with some "serverless" architectures that do things like this: https://www.shodan.io/host/34.255.41.58 ... you can see this host has a bunch of Ollama ports running really, really old versions.

You can pull down "new" manifests, but very few of these Ollamas are new enough for decent modern models like glm-4.7-flash.

But if codellama:13b is your jam, go wild I guess.
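For the curious, poking at these is trivial because Ollama's HTTP API has no auth at all. Something like this does it (a minimal Python sketch; /api/version and /api/tags are standard Ollama endpoints, and the host here is a placeholder for your own box):

    # Ollama exposes unauthenticated version and model-list endpoints,
    # which is presumably how Shodan et al. fingerprint these hosts.
    import json
    import urllib.request

    HOST = "http://127.0.0.1:11434"  # placeholder: point at your own instance

    for path in ("/api/version", "/api/tags"):
        with urllib.request.urlopen(HOST + path, timeout=5) as resp:
            print(path, "->", json.load(resp))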

I'm not sure the "journos" from Techradar are too familiar with how networks ... work.

IPv4 requires an inbound NAT rule these days to be reachable globally at all, unless you actually have a machine with a globally routable IP. There will probably be a default-deny firewall rule too. I do remember the days before NAT ...

IPv6 doesn't require NAT (though prefix translation is available, and so is ULA), but again a default-deny rule is likely in force.

You do actually have to try quite hard to expose something to the internets. I know this because I do a lot of it.

The entire article is just a load of buzzwords and basically bollocks. Yes, it is possible to expose a system on the internet, but it is unlikely that you do it by accident. If I were Sead, I'd go easy on the AI-generated cobblers and get a real job.

This is a bit of a weakness of Docker, I think.

I was rigging this up myself and, conscious of the fact that basic Docker is "all or nothing" for container port forwarding (it's built for presenting network services), I had to dig around with iptables to get something similar to binding on localhost.

The use case: https://github.com/meltyness/tax-pal

The Ollama container is fairly easy to deploy, and supports GPU inference through the NVIDIA Container Toolkit. I'd imagine many of these are Docker containers.

e: I stand corrected, apparently `-p` of `docker run` can have a binding interface stipulated (example below)

e2: https://docs.docker.com/engine/containers/run/#exposed-ports, which is not mentioned in some of the docs

e3: but it's in the man page ofc
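For anyone following along, a minimal sketch of the safer invocation (the ollama/ollama image name and the named volume are per that image's own docs; --gpus=all assumes the NVIDIA Container Toolkit is installed, drop it for CPU-only):

    # Bind the container's API port to loopback only, so it is reachable
    # from this machine but not from the rest of the network.
    docker run -d --gpus=all \
      -v ollama:/root/.ollama \
      -p 127.0.0.1:11434:11434 \
      --name ollama ollama/ollama

The bare `-p 11434:11434` form binds 0.0.0.0, which is exactly how you end up in Shodan results like the ones above.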

  • Out of curiosity, why would you need to wrap the call to an Ollama modelfile in Docker? Does the dockerized Ollama client provide some benefit, when it's shelling out to the local Ollama instance anyway? (Wrt tax-pal)

    • It's more of a distribution thing for me, really. I'm basically using Docker as a package manager, since they otherwise distribute through one of those ad-hoc shell scripts, and I'd prefer not to accidentally break Debian with one somehow.

      I've built Ollama before too, but I like that I can cleanly rip it out of my system or upgrade it without handing root off to some shell script somewhere, I guess.

      If anyone's gonna bash up my system it oughta be me

The tool-calling thing here is overblown.

When you do "tool calling" with an LLM, all you're doing is having the LLM generate output in a particular format you can parse out of the response; it's then your code's responsibility to run the tools (locally) and stick the results back into the conversation.

So that _specific_ part isn't RCE. It's still bad for the nine million other obvious reasons though.
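To make that concrete, here's a minimal sketch of one tool-calling round trip against a local Ollama server (the /api/chat endpoint and the tool_calls response shape follow Ollama's tool-calling API as I understand it; the model choice and the get_weather tool are made-up examples):

    # The model only ever *names* a tool and its arguments in its output;
    # our code decides whether and how to run it, locally.
    import json
    import urllib.request

    OLLAMA_URL = "http://127.0.0.1:11434/api/chat"

    def get_weather(city: str) -> str:
        # A stand-in local "tool"; nothing here is executed by the model.
        return f"It is sunny in {city}"

    TOOLS = {"get_weather": get_weather}

    WEATHER_TOOL = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

    def chat(messages):
        body = json.dumps({"model": "llama3.1", "messages": messages,
                           "tools": [WEATHER_TOOL], "stream": False}).encode()
        req = urllib.request.Request(
            OLLAMA_URL, data=body,
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["message"]

    messages = [{"role": "user", "content": "What's the weather in Oslo?"}]
    reply = chat(messages)
    messages.append(reply)

    # Parse the structured "tool call" out of the response and run it ourselves.
    for call in reply.get("tool_calls", []):
        fn = call["function"]
        result = TOOLS[fn["name"]](**fn["arguments"])
        messages.append({"role": "tool", "content": result})

    print(chat(messages)["content"])

The server never executes anything beyond inference; it just emits JSON naming a tool, and the loop above is where the actual execution decision lives.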

I see this happen all the time when people just want their new toys to work right away. They copy and paste commands from the internet to open up the connection but they forget to put a lock on the door. It is dangerous that so many people run these programs without understanding the basics of how networks work.

  • Even better.

    We have people who openly admit that they have never written or read a line of code, have no idea what it does, and are using AI to deploy "AI tools" without knowing how to secure them.

    Infosec experts are going to have a great time collecting lots of money from this.

This is a combination of problems: a poor default (listening on all interfaces?) plus the fact that IPv6 hosts can be publicly accessible. It depends a bit on how things are configured upstream by default, but it's a gotcha compared to IPv4.
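For what it's worth, a native `ollama serve` binds to loopback unless you tell it otherwise; the trouble usually starts with the "make it reachable" advice people paste around (and the official Docker image effectively listens on all interfaces inside the container, so it's one `-p` away from public). Roughly:

    # Default: only reachable from the local machine
    ollama serve                            # binds 127.0.0.1:11434

    # The footgun pasted into many how-tos: listen on every interface,
    # which on a box with a publicly routable (often IPv6) address
    # means everyone.
    OLLAMA_HOST=0.0.0.0:11434 ollama serve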

Fun fact! On macOS you can expose privileged ports (<1024) using a regular user account.

But ONLY if you don't bind the listening socket to a specific interface. If you try to create a listening socket on localhost (e.g. 127.0.0.1:443) under a non-root account, you get a permission error.

Edit: the thing is, you CAN expose "0.0.0.0:443" without root privileges!
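Easy to check with a few lines of Python on a stock macOS box, run as a regular user (port 443 as in the example above):

    # Demonstrates the quirk described above: an unprivileged bind to a
    # low port succeeds on the wildcard address but fails on a specific one.
    import socket

    for addr in ("0.0.0.0", "127.0.0.1"):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind((addr, 443))
            s.listen(1)
            print(addr, "-> bound OK")
        except PermissionError as exc:
            print(addr, "->", exc)
        finally:
            s.close()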