Comment by kostaj
5 hours ago
Indeed. Real-world claims are somewhat messy. Some of the standard benchmarks, e.g. the questions in AVeriTeC, share similar characteristics.