Comment by swaminarayan
8 hours ago
How are you doing semantic end-of-turn detection without adding latency to the critical path? Is it a separate lightweight model or integrated into the LLM stream?