ceejayoz 5 days ago Sure. Claude does that. "Cogitated for 1m 50s" doesn't work for real-time applications.
charcircuit 5 days ago You can submit many queries in parallel to increase throughput. Smaller models and faster hardware can reduce the time per query too.
ceejayoz 5 days ago None of that gets you the 100ms response time the parent poster talked about, for something like "who is at my doorbell?" real-time uses.
sebmellen 5 days ago Ok. Claude will not work for this use case because none of the sample data (weirdly blurry ID images) is in the training data.
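To make the throughput vs. latency point from the exchange above concrete, here is a minimal Python sketch. It is not anyone's actual setup: `fake_query`, `run_batch`, `measure`, and the 200 ms `SIMULATED_LATENCY_S` are all hypothetical stand-ins for a real model call. It only illustrates that running queries in parallel raises queries per second while each individual answer still takes the full per-query time.

```python
import asyncio
import time
from typing import Tuple

# Hypothetical per-query latency: a stand-in for a model call, not a measured number.
SIMULATED_LATENCY_S = 0.2


async def fake_query(i: int) -> float:
    """Pretend to call a model and return how long this one query took."""
    start = time.perf_counter()
    await asyncio.sleep(SIMULATED_LATENCY_S)
    return time.perf_counter() - start


async def run_batch(n: int) -> Tuple[float, float]:
    """Run n queries concurrently.

    Returns (throughput in queries per second, fastest single-query latency in seconds).
    """
    start = time.perf_counter()
    latencies = await asyncio.gather(*(fake_query(i) for i in range(n)))
    wall = time.perf_counter() - start
    return n / wall, min(latencies)


def measure(n: int) -> Tuple[float, float]:
    """Synchronous wrapper so the sketch can be called from plain code."""
    return asyncio.run(run_batch(n))


if __name__ == "__main__":
    for n in (1, 10, 100):
        throughput, best_latency = measure(n)
        print(f"n={n:3d}  throughput={throughput:7.1f} q/s  "
              f"fastest reply={best_latency * 1000:.0f} ms")
```

With more concurrent queries the throughput figure climbs, but the fastest reply never drops below the simulated per-query time, so a 100 ms budget for a single "who is at my doorbell?" answer is not helped by parallelism alone.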