
Comment by rtkwe

5 months ago

We already see that at smaller scales with other machine learning algorithms, though; image recognition algorithms will latch onto any consistency in your training set more often than they learn to recognize what you actually want them to [0]. It's not a huge stretch to map that pattern out to a more generally intelligent system with a poorly defined reward function doing really weird stuff.

[0] Like the tumor recognition algorithm that instead learned to recognize rulers, or the triage algorithm that decided asthma patients had BETTER outcomes with pulmonary diseases, without making the connection that it's because they get higher-priority care - https://venturebeat.com/business/when-ai-flags-the-ruler-not...
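
For anyone who wants to see the "latches onto any consistency" effect in miniature, here is a toy sketch. Everything in it is made up for illustration: the two features (`signal`, `ruler`), the tiny datasets, and the hand-rolled logistic regression are assumptions, not the setup from the linked article. In the training set the ruler is perfectly correlated with the label while the real signal is only right 75% of the time, so plain gradient descent ends up leaning on the ruler.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy data: ((signal, ruler), label), label 1 = malignant.
# Training set: a ruler is present in every malignant image and in no
# benign one; the genuine signal agrees with the label only 6 times out of 8.
TRAIN = [
    ((1, 1), 1), ((1, 1), 1), ((1, 1), 1), ((0, 1), 1),
    ((0, 0), 0), ((0, 0), 0), ((0, 0), 0), ((1, 0), 0),
]

# Shifted set: the ruler no longer tracks the label, the signal does.
SHIFTED = [
    ((1, 0), 1), ((1, 0), 1),
    ((0, 1), 0), ((0, 1), 0),
]

def train(data, steps=5000, lr=0.5):
    """Batch gradient descent on the mean logistic loss."""
    w_signal = w_ruler = bias = 0.0
    n = len(data)
    for _ in range(steps):
        g_signal = g_ruler = g_bias = 0.0
        for (signal, ruler), label in data:
            err = sigmoid(w_signal * signal + w_ruler * ruler + bias) - label
            g_signal += err * signal
            g_ruler += err * ruler
            g_bias += err
        w_signal -= lr * g_signal / n
        w_ruler -= lr * g_ruler / n
        bias -= lr * g_bias / n
    return w_signal, w_ruler, bias

def accuracy(params, data):
    """Fraction of examples classified correctly at a 0.5 threshold."""
    w_signal, w_ruler, bias = params
    correct = 0
    for (signal, ruler), label in data:
        predicted = 1 if w_signal * signal + w_ruler * ruler + bias > 0 else 0
        correct += predicted == label
    return correct / len(data)

params = train(TRAIN)
w_signal, w_ruler, bias = params
train_acc = accuracy(params, TRAIN)
shifted_acc = accuracy(params, SHIFTED)

print("weight on signal:", round(w_signal, 3))
print("weight on ruler: ", round(w_ruler, 3))
print("train accuracy:  ", train_acc)
print("shifted accuracy:", shifted_acc)
```

The model looks fine on the data it was trained on and falls apart as soon as the ruler stops tracking the label, which is the same failure as the footnote, just shrunk down to two features.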

I think it is a huge stretch to believe that patterns which appear in one set of algorithms (simple non-AGI algorithms) will also appear in another set (AGI algorithms).

Unless there is some physical reason for the behavior, I wouldn't make any strong claims. The specificity of algorithms is why AGI is hard in the first place, because at the end of the day you have a single operation running on a single data structure (it helps when it's a few TB).

  • I think the pattern holds even as you increase the intelligence: a machine does not, merely by being able to mimic intelligence, come with the same framework for understanding what is requested of it.