Comment by gaeld

4 hours ago

I guess you were thinking of consumer GPUs. We are indeed talking about standard datacenter GPUs.

Sorry for the confusion.

Do you think it might make sense to change your article's title from "Real-time LLM Inference on Standard GPUs" to "Real-time LLM Inference on Standard Datacenter GPUs"? More people seem confused by the title than not, and you could clear this up fairly easily, at least on your website, although it might be too late to fix the HN title.

  • YES - I just updated the title of our article according to your suggestion.