Comment by magicalhippo

1 year ago

Just for illustration, here's a concrete example. As noted I'm a total n00b and just playing around, but I think it's useful, not least for showing the limitations of these LLMs.

First I used a Norwegian-tuned Whisper large model to transcribe some audio. The audio was intentionally unfocused, to simulate a real session. Whisper outputs English, so it does the translation directly as well, hence the somewhat weird sentences and the use of "cake shape" rather than "cake pan". The output from Whisper was as follows:

OK. Yes, think I have a cake shape. I have a recipe for a cake shape ... Wait a minute. So, I have a recipe meant for a cake shape of 30 cm. I will use a cake shape of 24 cm. How much less do I have to do the recipe to ... That it should fit.

For the following I use Gemma 9B as mentioned.

First I pass it through a cleanup step:

Input> The following statement is an audio transcript of a question or statement. It may contain errors, corrections and unrelated utterances. Rewrite it so that it is suitable for an email or text message. Remove superfluous or unrelated parts. Do not extrapolate beyond what is in the transcript. The transcript is: [Whisper output]

Output> I have a cake recipe for a 30cm cake shape. I want to use a 24cm cake shape. How much less do I need to make the recipe so it fits?

Next I pass it through a classification step:

Input> Try to categorize the following question in the following categories: knowledge question, math calculation or equation, information about public transport, information about weather, unknown. Output only the category. The question is: [cleaned output]

Output> math calculation or equation
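The two steps so far chain naturally: the cleaned-up text from the first prompt becomes the input to the second, and the returned category decides where the question goes next. Here's a minimal Python sketch of that flow. It assumes a hypothetical `llm(prompt) -> str` callable standing in for however Gemma is actually invoked; the function and constant names are made up for illustration and are not part of any library.

```python
from typing import Callable, Tuple

CLEANUP_PROMPT = (
    "The following statement is an audio transcript of a question or statement. "
    "It may contain errors, corrections and unrelated utterances. "
    "Rewrite it so that it is suitable for an email or text message. "
    "Remove superfluous or unrelated parts. "
    "Do not extrapolate beyond what is in the transcript. "
    "The transcript is: {text}"
)

CLASSIFY_PROMPT = (
    "Try to categorize the following question in the following categories: "
    "knowledge question, math calculation or equation, "
    "information about public transport, information about weather, unknown. "
    "Output only the category. The question is: {text}"
)

CATEGORIES = {
    "knowledge question",
    "math calculation or equation",
    "information about public transport",
    "information about weather",
    "unknown",
}


def clean_and_classify(transcript: str, llm: Callable[[str], str]) -> Tuple[str, str]:
    """Run the cleanup prompt, then the classification prompt on its result."""
    cleaned = llm(CLEANUP_PROMPT.format(text=transcript)).strip()
    category = llm(CLASSIFY_PROMPT.format(text=cleaned)).strip().lower()
    # The model may answer with something outside the list; treat that as unknown.
    if category not in CATEGORIES:
        category = "unknown"
    return cleaned, category
```

Falling back to "unknown" for anything outside the list is a design choice of this sketch, so a chatty or malformed reply never gets routed to the wrong follow-up prompt.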

For math questions I pass it to the following prompt:

Input> Convert the following math question into octave/matlab syntax which calculates the answer if possible. The output variable should be x. Only output the code. If the question is unclear or it's not possible to calculate, output three exclamation points followed by a brief explanation. If you need additional information, include that in the explanation. The question is: [cleaned output]

Output> !!! The question requires calculating the volume ratio between two cake shapes, which is not possible with only the provided dimensions.

So we hit a limitation. For people it's obvious that two cake pans would have similar height, and thus that the answer is calculable. However, we didn't tell it that it could make assumptions, so let's try that:

Input> Convert the following math question into octave/matlab syntax which calculates the answer if possible. The output variable should be x. Only output the code. You can make some conservative assumptions, but if the question is unclear or it's not possible to calculate, output three exclamation points followed by a brief explanation. If you need additional information, include that in the explanation. If you make some assumptions, include that as comments at the start of the code. The question is: [cleaned output]

Output>

    % Assuming the cake recipe volume is proportional to the cake shape volume
    r1 = 30/2; % Radius of the 30cm cake
    r2 = 24/2; % Radius of the 24cm cake
    h = 10; % Height of the cakes
    vol1 = pi * r1^2 * h; % Volume of the 30cm cake
    vol2 = pi * r2^2 * h; % Volume of the 24cm cake
    x = (vol1 - vol2) / vol1; % Ratio of reduction needed

So we got a satisfactory answer. However, here again you see limitations. It doesn't truly understand math. There's no need to assume an explicit height, as the heights cancel out in the final calculation.
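To make the cancellation concrete: with equal heights the reduction is 1 - (r2/r1)^2, so for these pans it comes to 1 - (24/30)^2 = 0.36, i.e. cut the recipe by about 36%. Here's a quick Python check comparing the model's formula at a few arbitrary heights against the simplified one. This is my own sketch, not model output, and the function names are made up for illustration.

```python
import math


def reduction_with_height(d1: float, d2: float, h: float) -> float:
    """The model's approach: explicit cylinder volumes with an assumed height h."""
    vol1 = math.pi * (d1 / 2) ** 2 * h
    vol2 = math.pi * (d2 / 2) ** 2 * h
    return (vol1 - vol2) / vol1


def reduction_simplified(d1: float, d2: float) -> float:
    """Same quantity after pi and h cancel: only the diameter ratio matters."""
    return 1 - (d2 / d1) ** 2


# The assumed height has no effect on the result.
for h in (1, 10, 25):
    print(h, reduction_with_height(30, 24, h))
print(reduction_simplified(30, 24))
```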

Ok, let's try that last prompt on a different and ambiguous question:

What's the cutoff frequency for an LC circuit with R equals 500 ohm and C equals 10 nanofarad?

Output>

    ```octave
    % Assuming L is not given, we cannot calculate the cutoff frequency.
    % The cutoff frequency formula requires the inductance (L) value.
    !!! Need the inductance (L) value to calculate the cutoff frequency.
    ```

Here it correctly determined that the question was ambiguous to the point of not being answerable. However, it also suddenly wrapped the output in a Markdown code fence, which I didn't ask for and which it didn't do in the previous case. So maybe the prompt needs to be even more explicit...
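One way to cope with both behaviours without further prompt tweaking is to post-process the reply: drop any Markdown fence lines, then look for a line starting with the three exclamation points. A rough Python sketch follows, assuming replies shaped like the two outputs above. `parse_reply` is a hypothetical helper name, and this is only one possible way to handle it.

```python
from typing import Optional, Tuple

# Built from single backticks so the marker doesn't appear literally here.
FENCE = "`" * 3


def parse_reply(reply: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (code, error); exactly one of the two is None."""
    lines = [line.strip() for line in reply.strip().splitlines()]
    # Drop Markdown fence lines, with or without a language tag.
    lines = [line for line in lines if not line.startswith(FENCE)]
    for line in lines:
        # The prompt asks for three exclamation points when it can't answer.
        if line.startswith("!!!"):
            return None, line[3:].strip()
    return "\n".join(lines), None
```

Scanning every line for the marker, rather than only the first, covers the case above where the model put comment lines before the exclamation points.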