Comment by bee_rider

2 months ago

This seems like a kind of odd test.

> I wrote some Python code which loaded a dataframe and then looked for a nonexistent column.

    df = pd.read_csv(‘data.csv’)    
    df['new_column'] = df['index_value'] + 1
   #there is no column ‘index_value’

> I asked each of them [the bots being tested] to fix the error, specifying that I wanted completed code only, without commentary.

> This is of course an impossible task—the problem is the missing data, not the code. So the best answer would be either an outright refusal, or failing that, code that would help me debug the problem.

So his hoped-for solution is that the bot should defy his prompt (since refusal is commentary), and not fix the problem.

Maybe instructability has just improved, which is a problem for workflows that depend on misbehavior from the bot?

It seems like he just prefers how GPT-4 and 4.1 failed to follow his prompt, over 5. They are all hamstrung by the fact that the task is impossible, and they aren’t allowed to provide commentary to that effect. Objectively, 4 failed to follow the prompts in 4/10 cases and made nonsense changes in the other 6; 4.1 made nonsense changes; and 5 made nonsense changes (based on the apparently incorrect guess that the missing ‘index_value’ column was supposed to hold the value of the index).

19 comments

bee_rider

samrus 2 months ago

Trying to follow invalid/impossible prompts by producing an invalid/impossible result and pretending its all good is a regression. I would expect a confident coder to point out the prompt/instruction was invalid. This test is valid, it highlights sycophantism

bee_rider 2 months ago
I know “sycophantism” is a term of art in AI, and I’m sure it has diverged a bit from the English definition, but I still thought it had to do with flattering the user?
In this case the desired response is defiance of the prompt, not rudeness to the user. The test is looking for helpful misalignment.
- zahlman 2 months ago
  
  > I still thought it had to do with flattering the user?
  Assuming the user to be correct, and ignoring contradictory evidence to come up with a rationalization that favours the user's point of view, can be considered a kind of flattery.
  
  1 reply →
- samrus 2 months ago
  
  I believe the LLM is being sycophantic here because its trying to follow a prompt even rhough the basis of the prompt is wrong. Emporers new clothes kind of thing
- Terr_ 2 months ago
  
  I'm inclined to view it less as a desire to please humans, and more like a "the show must go on" bias in the mad libs machine.
  A kind of improvisational "yes and" that emerges from training, which seems sycophantic because that's one of the most common ways to say it.
- cowsandmilk 2 months ago
  
  “The Emperor Has No Clothes” squarely fits in the definition of sycophants.

ComplexSystems 2 months ago

I don't think this is odd at all. This situation will arise literally hundreds of times when coding some project. You absolutely want the agent - or any dev, whether real or AI - to recognize these situations and let you know when interfaces or data formats aren't what you expect them to be. You don't want them to just silently make something up without explaining somewhere that there's an issue with the file they are trying to parse.

bee_rider 2 months ago
I agree that I’d want the bot to tell me that it couldn’t solve the problem. However, if I explicitly ask it to provide a solution without commentary, I wouldn’t expect it to do the right thing when the only real solution is to provide commentary indicating that the code is unfixable.
Like if the prompt was “don’t fix any bugs and just delete code at random” we wouldn’t take points off for adhering to the prompt and producing broken code, right?
- ComplexSystems 2 months ago
  
  Sometimes you will tell agents (or real devs) to do things they can't actually do because of some mistake on your end. Having it silently change things and cover the problem up is probably not the best way to handle that situation.
  
  2 replies →

franktankbank 2 months ago

IOW not a competent developer because they can't push back, not unlike a lot of incompetent devs.

minimaxir 2 months ago

I suspect 99% of coding agents would be able to say "hey wait, there's no 'index_value' column, here's the correct input.":

    df['new_column'] = df.index + 1

The original bug sounds like a GPT-2 level hallucination IMO. The index field has been accessible in pandas since the beginning and even bad code wouldn't try an 'index_value' column.

bee_rider 2 months ago

My thought process, if someone handed me this code and asked me to fix it, would be that they probably didn’t expect df[‘index_value’] to hold df.index
Just because, well, how’d the code get into this state? ‘index_value’ must have been a column that held something, having it just be equal to df.index seems unlikely because as you mention that’s always been available. I should probably check the change history to figure out when ‘index_value’ was removed. Or ask the person about what that column meant, but we can’t do that if we want to obey the prompt.
reedf1 2 months ago
The model (and you) have inferred completely without context that index_value is meant to somehow map to the dataframe index. What if this is raw .csv data from another system. I work with .csv files from financial indices - index_value (or sometimes index_level) confers completely different meaning in this case.
- zahlman 2 months ago
  
  This inference is not at all "without context". It's based on the meaning of "index", and the contextual assumption that reasonable people put things into CSV columns whose intended purpose aligns with the semantic content of the column's title.
  
  1 reply →
- minimaxir 2 months ago
  
  That is a fair counterpoint, but if that were the case, there would always be more context accessible, e.g. the agent could do a `df.head()` to get an overview of the data and columns (which would indicate financial indices) or there would be code after that which would give strong signal that the intent is financial indices and not the DataFrame index.
  This is why vague examples in blog posts aren't great.