Comment by quectophoton

3 months ago

My Occam's Razor guess: There might be some processing being done before the input is passed to the LLM, and some processing before the response is sent back to the user.

Something like a first pass on the input to detect language or format, and try to do some adjustments based on that. I wouldn't be surprised if there's a hex or base64 detection and decoding pass being done as pre-processing, and maybe this would trigger a similar post-processing step.

And if this is the case, the censorship could be running at a step too late to be useful.