Comment by timhigins

7 months ago

> LLM could hallucinate

The job of any context retrieval system is to retrieve the relevant information for the task so the LLM doesn't hallucinate. Maybe build a benchmark based on less-known external libraries, with test cases that check the output is correct (or with a mocking layer to verify that the LLM-generated code calls roughly the right functions).
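
A rough sketch of what one such mocking-layer check could look like (here `generate_code` is a hypothetical placeholder for the retrieval + LLM pipeline, and `fakelib` stands in for a less-known library; neither is from an existing project):

```python
import sys
from unittest import mock


def generate_code(task: str) -> str:
    """Hypothetical helper: returns LLM-generated Python source for `task`."""
    raise NotImplementedError  # plug in the retrieval + LLM pipeline here


def run_benchmark_case(task: str, library: str, expected_call: str) -> bool:
    """Run one benchmark case: execute the generated code with the target
    library replaced by a mock, then check that roughly the right function
    was called (no real install of the library needed)."""
    fake_module = mock.MagicMock(name=library)
    # Any `import <library>` in the generated code resolves to the mock.
    with mock.patch.dict(sys.modules, {library: fake_module}):
        source = generate_code(task)
        exec(compile(source, "<generated>", "exec"), {})
    return getattr(fake_module, expected_call).called


# e.g. did the generated code end up calling fakelib.connect(...)?
# run_benchmark_case("open a connection with fakelib", "fakelib", "connect")
```

This only checks that the expected top-level function was touched, which is the "roughly correct" bar; stricter cases could assert on call arguments instead.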

Thanks for the feedback. This will be my next step. Personally, I feel it would be hard to design those test cases by myself.