Comment by curl-up

4 hours ago

The prompt they use in `Figure 28.` is a complete mess, all the way from starting it with "Your are an expert", to the highly overlapping categories, to the poorly specified JSON with no clear direction on how to fill in those fields.

A similar mess can be found in `Figure 34.`, with the added bonus of "DO NOT MAKE MISTAKES!" and "If you make a mistake you'll be fined $100".

Also, why are all of these research papers always using such weak LLMs to do anything? All of this makes their results very questionable, even if they mostly agree with "common intuition".