Comment by RantyDave
12 hours ago
Ahhh, so is this a chip "more optimised" for connecting GPU's to reality ... or are they skipping the GPU step entirely? Are GPU's only for training now?
have you seen this: https://chatjimmy.ai/
It's quite impressive what purpose-built inference can/will do once everyone stops trying to be the one with the best model.
Wow impressive. What's the story with this?
It's a tech demonstrator for a company that turns models into custom silicon for fast inference. In this case llama3.1-8b https://taalas.com/products/
1 reply →
Taalas's hardware implementation of Llama 3.1 8B. They claim 16k tok/s vs Cerebras at 2k. https://taalas.com/products/