Comment by vitus

1 year ago

Am I misinterpreting the prompt, or did the LLM misinterpret it from the get-go?

    Given a list of 1 million random integers between 1 and 100,000, find the difference between the smallest and the largest numbers whose digits sum up to 30.

That doesn't read to me as "generate a list of 1 million random integers, then find the difference ..." but rather, "write a function that takes a list of integers as input".

That said, my approach to "optimizing" this comes down to: generate the biggest valid number in the range (as many nines as will fit, followed by whatever digit remains, followed by all zeroes), generate the smallest valid number in the range (the biggest number with its digits reversed), check that both exist in the list (which should happen With High Probability -- roughly 99.99% of the time), then return the right answer.
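
Something like this sketch (names are mine, and it hard-codes the 1..100,000 range from the prompt; the fallback scan covers the unlucky case where an extreme wasn't drawn):

    def digit_sum(n: int) -> int:
        return sum(int(d) for d in str(n))

    def extremes_with_digit_sum(upper: int, target: int) -> tuple[int, int]:
        # Largest value with the given digit sum: as many 9s as fit, then the
        # remainder, padded with zeros (9+9+9+3+0 = 30 -> 99930 for 5 digits).
        # The smallest is that number with its digits reversed (3999).
        # Assumes such a number fits in `width` digits, which holds here.
        width = len(str(upper - 1))          # 5 digits for upper = 100_000
        nines, rem = divmod(target, 9)
        digits = [9] * nines + ([rem] if rem else [])
        digits += [0] * (width - len(digits))
        largest = int("".join(map(str, digits)))
        smallest = int("".join(map(str, digits[::-1])))
        return smallest, largest

    def digit_sum_diff(nums: list[int]) -> int:
        seen = set(nums)
        lo, hi = extremes_with_digit_sum(100_000, 30)
        if lo in seen and hi in seen:        # true w.h.p. for 1e6 draws
            return hi - lo
        # Rare fallback: scan for all numbers whose digits sum to 30.
        valid = [n for n in nums if digit_sum(n) == 30]
        return max(valid) - min(valid)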

With that approach, the bottleneck in the LLM's interpretation is generating random numbers: the original random.randint approach takes almost 300ms, whereas just using a single np.random.randint() call takes about 6-7ms. If I extract the random number generation outside of the function, then my code runs in ~0.8ms.
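
For reference, the two generation strategies look something like this (note that np.random.randint excludes its upper bound, so it needs 100_001 to match random.randint's inclusive 100_000):

    import random
    import numpy as np

    N = 1_000_000

    # Pure-Python loop: almost 300ms for me.
    nums = [random.randint(1, 100_000) for _ in range(N)]

    # Single vectorized call: about 6-7ms.
    nums = np.random.randint(1, 100_001, size=N)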

> That doesn't read to me as "generate a list of 1 million random integers, then find the difference ..." but rather, "write a function that takes a list of integers as input".

This was the intent, and it's indeed a common assumption for coding questions in job interviews; notably, it's fixed in the prompt-engineered version. I didn't mention it because it's arguably just semantics: it doesn't affect the logic or performance, which were the point of the benchmarking.

I like the idea of your optimization, but it won't work as stated. The largest would be something close to MAXINT, the smallest 3999. With a range of 2 billion over 32 bits, the odds of both of these being in a list of a million are quite a bit poorer than 99.99%.

  • The stated inputs are integers between 1 and 100,000, so if you're generating 1 million inputs, then you have a 0.99999 ^ 1e6 ≈ 4.5e-5 chance (roughly e^-10) of missing any given value, or roughly double that for missing either member of a given pair (quick sanity check below).

    The key observation here is that you're sampling a relatively small space with a much greater number of samples, so you have a very high probability of hitting any given point in the space.

    Of course, it wouldn't work if you considered the full 32-bit integer space without increasing the number of samples to compensate. And you'd need to be a little more clever to compute the largest possible value in your range.
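
    A quick sanity check of those numbers:

        import math

        p_miss_one = (1 - 1 / 100_000) ** 1_000_000   # ~4.5e-5
        print(p_miss_one, math.exp(-10))    # both roughly 4.5e-5
        print(2 * p_miss_one)               # union bound: miss either extreme, ~9e-5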

    • Indeed. My understanding is that people ask this sort of thing in interviews specifically to see if you notice the implications of restricting the input values to a narrow range.