Comment by timhigins

7 months ago

> LLM could hallucinate

The job of any context retrieval system is to retrieve the relevant information for the task so the LLM doesn't hallucinate. Maybe build a benchmark based on less-known external libraries, with test cases that check the output is correct (or with a mocking layer to verify that the LLM-generated code calls roughly the right functions).
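
A rough sketch of what one such mocking-layer check could look like (here `generate_code` is a hypothetical placeholder for the retrieval + LLM pipeline, and `fakelib` stands in for a less-known library; neither is from an existing project):

```python
import sys
from unittest import mock


def generate_code(task: str) -> str:
    """Hypothetical helper: returns LLM-generated Python source for `task`."""
    raise NotImplementedError  # plug in the retrieval + LLM pipeline here


def run_benchmark_case(task: str, library: str, expected_call: str) -> bool:
    """Run one benchmark case: execute the generated code with the target
    library replaced by a mock, then check that roughly the right function
    was called (no real install of the library needed)."""
    fake_module = mock.MagicMock(name=library)
    # Any `import <library>` in the generated code resolves to the mock.
    with mock.patch.dict(sys.modules, {library: fake_module}):
        source = generate_code(task)
        exec(compile(source, "<generated>", "exec"), {})
    return getattr(fake_module, expected_call).called


# e.g. did the generated code end up calling fakelib.connect(...)?
# run_benchmark_case("open a connection with fakelib", "fakelib", "connect")
```

This only checks that the expected top-level function was touched, which is the "roughly correct" bar; stricter cases could assert on call arguments instead.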

Thanks for the feedback. This will be my next step. Personally, I feel it would be hard to design those test cases by myself.