Comment by deepdarkforest
1 day ago
I'm working on Flavia, an ultra-low latency voice AI data analyst that can join your meetings. You can throw in data (CSVs, Postgres databases, BigQuery, PostHog analytics for now) and just talk and ask questions. Using Cerebras (2000 tokens per second) and very low latency sandboxes created on the fly, you can get back charts/tables/analysis in under 1 second (excluding the time of the actual SQL query if you are using BigQuery).
She can also join your Google Meet or Teams meetings and share her screen, so everyone in the meeting can ask questions and see live results. It's currently used by product managers and executives, mainly for analytics and data science use cases.
We plan to open-source it soon if there is demand. Very fast voice+actions is the future imo
This sounds amazing. A demo video would help me finish sign up - I can’t try it without hooking it up to real data, and I don’t want to for a test.
Great feedback, thanks! We have added a synthetic e-commerce dataset as an example when you sign up, so you can test it without your own data first. Will also add a demo video ASAP.
What kind of plan do you have with Cerebras? It seems like something like that would need one of the $1500/month plans at least if there were more than a handful of customers.
They introduced pay-as-you-go recently. The limits on that are similar to the plans, 1 million tokens per minute, so if you stack a few keys and do simple load balancing with Redis, you can cover a decent amount of traffic with no upfront cost. Eventually we would have to go enterprise though, yes!
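For anyone curious what "stack a few keys and load balance" could look like, here is a minimal sketch, not the actual Flavia setup. It rotates round-robin over keys and skips any key whose per-minute token budget is used up. The `KeyPool` class, the `acquire` method, and the in-memory `usage` dict are all hypothetical names for illustration; with several workers the counters would presumably live in Redis instead (an increment on a per-key, per-minute counter with an expiry), which is what the dict stands in for here.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class KeyPool:
    """Round-robin over API keys, skipping keys that are over budget."""

    keys: list
    tokens_per_minute: int = 1_000_000  # limit quoted in the comment above
    # (key, minute window) -> tokens reserved so far.
    # Assumption: in-memory stand-in for a shared Redis counter.
    usage: dict = field(default_factory=dict)
    _next: int = 0

    def acquire(self, estimated_tokens: int, now: Optional[float] = None) -> Optional[str]:
        """Reserve tokens on the next key with room; None if every key is full."""
        window = int((time.time() if now is None else now) // 60)
        # Forget counters that belong to earlier one-minute windows.
        self.usage = {k: v for k, v in self.usage.items() if k[1] == window}
        for offset in range(len(self.keys)):
            idx = (self._next + offset) % len(self.keys)
            key = self.keys[idx]
            used = self.usage.get((key, window), 0)
            if used + estimated_tokens <= self.tokens_per_minute:
                self.usage[(key, window)] = used + estimated_tokens
                self._next = (idx + 1) % len(self.keys)
                return key
        return None
```

Calling `pool.acquire(estimated_tokens)` before each request returns the key to use, or `None` when every key is at its limit for the current minute. This is a fixed-window counter, so it is only an approximation of whatever the provider enforces on its side.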
OK. When I tried to use pay-as-you-go it was unusable for me because there were a ton of 429s and 503s. In one test, every request came back 429 or 503 for a few seconds straight.
I am using it for a voice application though, so retrying causes a delay that the user doesn't expect, especially if it stays unavailable for a few seconds.
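One way to keep retries from adding visible delay in a voice app is to skip the backoff and move straight to another key, giving up once a small latency budget is spent. The sketch below is only an illustration under assumptions: `send` is a hypothetical callable that takes a key and returns a `(status, body)` tuple, and the 0.3 second default deadline is an arbitrary example value. It would not help if the whole service is returning 503 for several seconds, only when individual keys are being rate limited.

```python
import time

RETRYABLE = {429, 503}  # the two status codes mentioned above


def call_with_failover(send, keys, deadline_s=0.3, clock=time.monotonic):
    """Try each key once, moving on immediately after a 429/503.

    `send(key)` is a hypothetical callable returning (status, body).
    Stops early once `deadline_s` seconds have passed, so the caller
    can fall back (e.g. to a filler phrase) instead of stalling.
    """
    start = clock()
    last_status = None
    for key in keys:
        if clock() - start >= deadline_s:
            break  # latency budget spent; do not start another attempt
        status, body = send(key)
        if status not in RETRYABLE:
            return status, body
        last_status = status
    # Every attempted key was rate limited or unavailable.
    return last_status, None
```

A `None` body tells the caller that no key answered within the budget, at which point the application can decide what the user should hear instead of silence.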