Comment by gavmor

18 hours ago

Take a look at https://chatjimmy.ai/ -- it's running against Taalas' "hardcore" silicon model, i.e., a dedicated, ASIC-like chip.

Wow - it's actually pretty astonishing how fast their inference is. So fast it feels fake?

  • Yeah, when inference is that fast it almost feels like the answer arrives before you hit return. Now imagine it running locally, with no server round-trip.