Comment by y33t
1 year ago
My favorite example was a game of pong with the goal of staying alive as long as possible. One ML algo just paused the game and left it like that.
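A minimal sketch of why that hack works, assuming a hypothetical gym-style Pong wrapper where the agent earns +1 per timestep survived and a PAUSE action freezes the game (all names here are illustrative, not a real API):

    PAUSE, UP, DOWN = 0, 1, 2

    def step(state, action):
        """One environment step; reward is +1 per timestep survived."""
        if action == PAUSE:
            # Frozen game: the losing condition can never trigger,
            # but the survival reward keeps accruing.
            return state, 1.0, False
        # Crude stand-in for the real ball/paddle physics:
        ball_y = state["ball_y"] + state["ball_dy"]
        done = abs(ball_y - state["paddle_y"]) > 1  # ball slipped past the paddle
        return {**state, "ball_y": ball_y}, 1.0, done

    state = {"ball_y": 0, "ball_dy": 1, "paddle_y": 0}
    total = 0.0
    for _ in range(1000):               # the "policy": always press pause
        state, reward, done = step(state, PAUSE)
        total += reward
    print(total)                        # 1000.0 and counting; the agent never loses

An agent maximizing cumulative reward discovers that emitting PAUSE forever yields unbounded return: the episode never terminates, so the survival reward never stops.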
My favorite was the ML learning to make the lowest-impact landing in a flight simulator. It discovered that it could wrap the impact float value if the impact was high enough, so instead of figuring out the optimal landing, it started figuring out the optimal path to the highest-impact crashes.
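A toy illustration of that wraparound. The comment says "float," but wrapping is characteristic of fixed-width integers, so this sketch assumes (hypothetically, since the story gives no details) that the simulator stored impact force in a 16-bit integer and scored landings by how close the impact was to zero:

    def wrap_int16(x: int) -> int:
        # Simulate 16-bit two's-complement wraparound on overflow.
        return (x + 2**15) % 2**16 - 2**15

    def landing_score(impact: int) -> int:
        # Softer landing = score closer to zero.
        return -abs(wrap_int16(impact))

    print(landing_score(100))      # -100: a gentle touchdown
    print(landing_score(65536))    # 0: a crash so hard the value wraps to "no impact"

So the optimizer stopped hunting for soft landings and started hunting for impacts violent enough to overflow back to a perfect score.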
This comment ought to be higher up. It's such a perfect summary of what I have struggled to understand, which is the "danger" of AI once we allow it to control things.
And yes, you can fix the bug, but the bike wheel guy shows there will always be another bug. We need a paper/proof that invents a process that can put an AI-supported (no human intervention) finite cap or limiter on the possible bug surface.
There is an apocryphal story about AI:
A conglomerate developed an AI vision system that you could hook up to your anti-aircraft systems to eliminate any chance of friendly fire. DARPA and the Pentagon went wild, pushing the system through testing so they could get to the live demonstration.
They hook up a live system loaded with dummy rounds, fly a few friendly planes over, and everything looks good. However, when they fly a captured MiG-21 over, the system fails to respond. The brass is upset and the engineers are all scratching their heads trying to figure out what is going on, but as the sun sets the system lights up, trying to shoot down anything in the sky.
They quickly shut down the system and do a postmortem. In the review they find that all the training data for friendly planes were perfect-weather, blue-sky overflights, and all the training data for the enemy were nighttime/low-light pictures. The AI determined that anything flying during the day is friendly and anything flying at night is to be terminated with extreme prejudice.
Is AI the danger, or is our inability to simplify a problem down to an objective function the problem?
If anything, AI could help by "understanding" the real objective, so we don't have to hand-code the simplified goals that ML models end up gaming, no?
Humans have the same vulnerability:
https://en.wikipedia.org/wiki/Perverse_incentive
All these claims are like saying "programming is impossible because I typed in a program and it had a bug." Yes, everyone's first attempt at a reward function is hackable. So you have to tighten up the reward function to exclude the solutions you don't want.
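For the pong example upthread, "tightening" might look as simple as excluding the degenerate action from the reward. A sketch under the same illustrative assumptions as that earlier snippet, not a general fix:

    PAUSE = 0  # same illustrative action constant as the pong sketch above

    def naive_reward(state: dict, action: int) -> float:
        return 1.0              # +1 per frame alive: pausing forever is optimal

    def tightened_reward(state: dict, action: int) -> float:
        if action == PAUSE:
            return 0.0          # the pause exploit no longer pays anything
        return 1.0              # only frames where play actually continues count

Each patch like this closes one exploit; it doesn't prove no others exist.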
What claim would that be? It’s a hilarious, factual example of unintended consequences in model training. Of course they fixed the freaking bug in about two seconds.
Ummm, I'm going to hold off on that FSD subscription for a bit longer...
Is that the Learnfun/Playfun that tom7 made? That one paused just before losing at Tetris and left it like that, because any other input would make it lose.
No, I want to say this was ~10 years ago. Happened to a university researcher, IIRC.