Comment by mlboss

1 year ago

One way to bypass the censor is to ask it to write its response using numbers in place of letters where it can, e.g. 4 for A, 3 for E, etc.

4 comments

mlboss

See, it's stuff like this that makes me believe the control issue may be near impossible to solve at the end of the day.

hackflip 1 year ago

Censorship just needs to work well enough for the average person. The brightest people who can bypass the censorship will be labeled crazy conspiracy theorists.

Jesus, we are reaching blinking-for-torture levels with these models: https://www.youtube.com/watch?v=WZ256UU8xJ0

This is day-one jailbreaking common sense.