Comment by elliotto
2 days ago
In my experience the API call is trivial compared to the time taken for the LLM to compose the response.
2 days ago
Gemini Flash and Groq are pretty fast, and that part is streamable. Curiosity got the best of me, so I had Claude Code write a quick test. Granted, the test is just 20 requests with a 1-second delay between them, run once, so take it with a grain of salt, but it's still interesting. An extra half second in a search is super noticeable, so Google is looking like a reasonable improvement.
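For anyone who wants to try something similar, here is a rough sketch of that kind of test. It is not the script Claude Code wrote. `time_request`, `run_benchmark`, and `fake_stream` are made-up names, and `fake_stream` is a stand-in you would swap for a real streaming API call. It records time to first chunk and time to full response, since the streaming part is what you actually feel.

```python
import statistics
import time
from typing import Callable, Iterable


def time_request(send: Callable[[], Iterable[str]]) -> tuple[float, float]:
    """Return (seconds to first chunk, seconds to full response) for one call."""
    start = time.perf_counter()
    first = None
    for _ in send():
        if first is None:
            first = time.perf_counter() - start
    total = time.perf_counter() - start
    # If nothing was streamed, fall back to the total time.
    return (first if first is not None else total), total


def run_benchmark(send: Callable[[], Iterable[str]],
                  requests: int = 20, delay: float = 1.0) -> dict:
    """Send `requests` calls, sleeping `delay` seconds between them."""
    samples = []
    for i in range(requests):
        samples.append(time_request(send))
        if i < requests - 1:
            time.sleep(delay)
    firsts = [s[0] for s in samples]
    totals = [s[1] for s in samples]
    return {
        "requests": len(samples),
        "median_first_chunk_s": statistics.median(firsts),
        "median_total_s": statistics.median(totals),
        "max_total_s": max(totals),
    }


def fake_stream():
    # Hypothetical stand-in for a real streaming API call.
    time.sleep(0.01)
    yield "hello"
    time.sleep(0.01)
    yield " world"


if __name__ == "__main__":
    # Short run against the fake stream; use the defaults (20 requests,
    # 1 second apart) when pointing this at a real endpoint.
    print(run_benchmark(fake_stream, requests=5, delay=0.0))
```

With only 20 samples from a single run, the median is probably more trustworthy than the mean, since one slow request can skew the average a lot.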