Comment by AlecSchueler

2 years ago

Easy fix since ChatGPT always apologises for not complying: any description or title containing the word "sorry" gets flagged for human oversight. Still orders of magnitude faster than writing all your own spam texts.
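
Roughly the sort of thing I have in mind (a toy sketch; the keyword list is just a placeholder and obviously incomplete):

    # Toy sketch: flag suspicious listings for human review based on refusal keywords.
    REFUSAL_KEYWORDS = ("sorry", "i cannot", "as an ai")

    def needs_human_review(title: str, description: str) -> bool:
        text = f"{title} {description}".lower()
        return any(keyword in text for keyword in REFUSAL_KEYWORDS)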

I think it would be better to ask it to wrap the answer with some known marker like START_DESCRIPTION and END_DESCRIPTION. This way if it refuses you'll be able to tell right away.
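
Sketching that idea (the marker names and regex are arbitrary):

    import re

    # Ask the model to wrap the real answer in known markers; a refusal won't contain them.
    MARKER_RE = re.compile(r"START_DESCRIPTION(.*?)END_DESCRIPTION", re.DOTALL)

    def extract_description(model_output: str) -> str | None:
        match = MARKER_RE.search(model_output)
        return match.group(1).strip() if match else None  # None => treat as a refusal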

As another user pointed out, sometimes it doesn't refuse by using the word "sorry".

  • In the same vein, I had a play with asking ChatGPT to `format responses as a JSON object with schema {"desc": "str"}` and it seemed to work pretty well. It gave me refusals in plaintext, and correct answers in well-formed JSON objects.
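
    The check is then basically just "did it parse" (a toy sketch):

        import json

        def parse_description(model_output: str) -> str | None:
            # Refusals came back as plaintext, so a failed parse (or a missing
            # "desc" key) is treated as non-compliance.
            try:
                data = json.loads(model_output)
            except json.JSONDecodeError:
                return None
            desc = data.get("desc")
            return desc if isinstance(desc, str) else None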

Except when it doesn't:

https://www.amazon.com/FOPEAS-Language-Context-referring-Inf...

The seller account's entire product list is a stream of scraped images with AI-nglish descriptions slapped on by autopilot. If you can cast thousands of lines for free and you know the ranger isn't looking, you don't need good bait to catch fish.

Sometimes it "apologizes" rather than saying "sorry", you could build a fairly solid heuristic but I'm not sure you can catch every possible phrasing.

OpenAI could presumably add a "did the safety net kick in?" boolean to API responses, and, also presumably, they don't want to do that because it would make it easier to systematically bypass.

  • > OpenAI could presumably add a "did the safety net kick in?" boolean to API responses, and, also presumably, they don't want to do that because it would make it easier to systematically bypass.

    Is a safety net kicking in, or is the model just trained to respond with a refusal to certain prompts? I am fairly sure it's usually the latter, and in that case even OpenAI can't be sure whether a particular response is a refusal or not.

  • Just feed the text to a new ChatGPT conversation and ask it whether the text is an apology or a product description.

    Or do traditional NLP, but letting ChatGPT classify your text is less effort to set up.
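
    Something along these lines (using the current openai Python client; the model name and prompt are just placeholders):

        from openai import OpenAI

        client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

        def is_refusal(text: str) -> bool:
            # Second pass: ask a fresh conversation to classify the first model's output.
            response = client.chat.completions.create(
                model="gpt-3.5-turbo",
                temperature=0,
                messages=[
                    {"role": "system",
                     "content": "Answer with exactly one word: APOLOGY or DESCRIPTION."},
                    {"role": "user", "content": text},
                ],
            )
            return "APOLOGY" in response.choices[0].message.content.upper()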

  • Why not have a separate chat request to apology-check the responses?

    Not my original idea; there was a link from HN where the dev did just that.

    • Sounds like a great way to double your API bills, and maybe that's worth it, but it seems pretty heavy-handed to me (and equally not 100% watertight).

Next up, retailers find out that copies of the board game Sorry! are being auto-declined. The human review that should have caught it is so backlogged that there's roughly a 1/3 chance of it timing out in the queue and the review task being discarded.

https://www.amazon.com/s?k=sorry

  • Hmm, someone else suggested this would be an issue, but the overall percentage of products with "sorry" in their description is very small, and having the human operator flag it as a false positive is still, as I say, orders of magnitude faster than writing your own product descriptions.

    • I mean, it works until the default prompt changes to not have "sorry" in it, or spammers add lots of legit products with "sorry" in the description, or some new product comes out that uses "sorry" in it, and then you're just playing cat and mouse.

  • This is exactly what I learned working on Internet-scale data. A new dude will walk in and proclaim that a simple rule will solve all your problems.

    • There very often are easy solutions for very niche problems, simply because nobody has bothered with them before.

      I don't see how a search result with 7 pages is supposed to demonstrate that this idea wouldn't work? I'm not saying whether it would be particularly helpful, but a human can review this entire list in a handful of minutes.

I would just make it respond ONLY in JSON, and if the formatting is non-compliant then don't use it. I doubt it'd apologize in JSON format. A quick test just now seems to work.

  • If you're using the API's JSON mode, it will apologize in JSON. If you prompt asking for JSON not in that mode, it should work like you're thinking.
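
    For reference, JSON mode is an opt-in flag on the chat completions call, roughly like this (model name is a placeholder); as said, refusals then come back as JSON too:

        from openai import OpenAI

        client = OpenAI()

        # JSON mode guarantees syntactically valid JSON, not compliance:
        # a refusal just arrives as a JSON object instead of plaintext.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system",
                 "content": 'Reply as a JSON object with schema {"desc": "str"}.'},
                {"role": "user", "content": "Write a product description for a stainless steel garlic press."},
            ],
        )
        print(response.choices[0].message.content)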

I'd create an embedding centroid by averaging a dozen or so apology responses. If the output's embedding is too close to that centroid, you can handle the exception appropriately.
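
A minimal sketch of that (the embedding model and threshold are placeholders; the threshold would need tuning against real outputs):

    import numpy as np
    from openai import OpenAI

    client = OpenAI()

    def embed(text: str) -> np.ndarray:
        res = client.embeddings.create(model="text-embedding-3-small", input=text)
        v = np.array(res.data[0].embedding)
        return v / np.linalg.norm(v)

    # Build the "apology centroid" once from a handful of known refusals.
    apologies = [
        "I'm sorry, but I can't help with that.",
        "I apologize, but I am unable to assist with this request.",
        # ... a dozen or so more examples
    ]
    centroid = np.mean([embed(a) for a in apologies], axis=0)
    centroid /= np.linalg.norm(centroid)

    def looks_like_refusal(output: str, threshold: float = 0.85) -> bool:
        # Cosine similarity against the centroid of known apologies.
        return float(np.dot(embed(output), centroid)) >= threshold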

Just have a second AI validate the first and tell it that its job is spotting fake products.

  • You joke, but this is unreasonably effective. We're prototyping using LLMs to extract, among other things, names from arbitrary documents.

    Asking the LLM to read the text and output all the names it finds -> it gets the names, but there are lots of false positives.

    Asking the LLM to then classify the list of candidate names it found as either name / not name -> damn near perfect.

    Playing around with it, it seems that the more text it has to read, the worse it performs at following instructions, so a low-accuracy pass over a lot of text followed by a high-accuracy pass over a much smaller set of data is the way to go (rough sketch below).
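
    Roughly the shape of it (`ask_llm` is a stand-in for whatever chat-completion call you're already making; the prompts are illustrative):

        from typing import Callable

        def extract_names(document: str, ask_llm: Callable[[str], str]) -> list[str]:
            # Pass 1: wide net over a lot of text (high recall, many false positives).
            raw = ask_llm(f"List every person's name in this text, one per line:\n\n{document}")
            candidates = [line.strip() for line in raw.splitlines() if line.strip()]

            # Pass 2: short, focused prompt per candidate (high precision).
            confirmed = []
            for candidate in candidates:
                verdict = ask_llm(f"Is '{candidate}' a person's name? Answer YES or NO.")
                if verdict.strip().upper().startswith("YES"):
                    confirmed.append(candidate)
            return confirmed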

    • What's your false negative rate? Also, where does it occur: is it the first LLM that omits names, or the second LLM that incorrectly classifies words as "not a name" when they are in fact names?

Why is Amazon not able to actually verify sellers' real identities and terminate their accounts? I would imagine that they should be able to force them to supply verifiable national identification, bank accounts, etc. How do these sellers get away with this?

Funny to see people seriously trying to be creative and find solutions to something that shouldn’t be a problem in the first place.

Maybe using machine readable status codes for responses, as everything else does, isn’t such a bad idea after all...

Another fix is to not create product listings for internet points. This product doesn't even show in search results on Amazon (or at least didn't when I checked). OP didn't "find" it. They made it. Probably to maintain hype.