Comment by rurban
1 day ago
Of course not. Users love the chatbot. It's fast, and easier to use than manually searching for answers or piecing together reports and graphs.
There is no latency, because the inference is done locally, on a server at the customer's site with a big GPU.
> There is no latency
Every chat bot I was ever forced to use has built-in latency, together with an animated "…" to simulate a real user typing. It’s the worst of all worlds.
> to simulate a real user typing
The models return a realtime stream of tokens.
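For illustration, a minimal sketch of what consuming such a stream looks like, assuming a locally hosted OpenAI-compatible endpoint (e.g. a llama.cpp or vLLM server); the base URL, model name, and prompt are placeholders, not our actual setup:

```python
from openai import OpenAI

# Point the client at a local, OpenAI-compatible inference server
# (placeholder URL and model name; adjust to your own deployment).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed-locally")

stream = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "Summarize yesterday's sales report."}],
    stream=True,  # tokens are pushed to the client as soon as they are generated
)

# Print each token fragment the moment it arrives -- no artificial typing delay.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```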
This was already the case before LLMs became a thing. It is still the case for no-intelligence, step-by-step bots.
Because they are all using some cloud service and an external LLM for that. We do not.
We sell our users a strong server where they have all their data and all their services. The LLM is local and trained by us.