Comment by gavmor

19 hours ago

Take a look at https://chatjimmy.ai/ -- it's running against Taalas' "hardcore" silicon model, i.e., a dedicated, ASIC-like chip.

2 comments

gavmor

Wow - actually pretty astonishing how fast their inference is. So fast it feels fake?

qingcharles 13 hours ago

Yeah, when you find fast inference like that it almost feels like the answer arrives before you hit return. Now imagine it running locally with no server round-trip.