jgalt212 (8 months ago): One would think if AI can generate the slop it could also triage the slop.
err4nt (8 months ago): How does it know the difference?
scubbo (8 months ago): I'm still on the AI-skeptic side of the spectrum (though shifting more toward "it has some useful applications"), but I think the easy answer is to use different models and prompts for the quality- and correctness-checking than were used for generation.
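A minimal sketch of that generator/reviewer split, assuming the OpenAI Python SDK; the model names and prompts here are placeholders, and the only point is that the triage call uses a different model and prompt than the one that produced the report:

```python
# Sketch: one model drafts a finding, a second, differently prompted model
# judges whether it looks like a real issue. Model names are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_report(code_snippet: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder "generator" model
        messages=[{
            "role": "user",
            "content": f"Review this code and describe any security bug you find:\n{code_snippet}",
        }],
    )
    return resp.choices[0].message.content

def triage_report(code_snippet: str, report: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder "reviewer" model, deliberately different
        messages=[{
            "role": "user",
            "content": (
                "You are triaging a bug report for plausibility, not rewriting it.\n"
                f"Code:\n{code_snippet}\n\nReport:\n{report}\n\n"
                "Answer VALID or INVALID, with one sentence of justification."
            ),
        }],
    )
    return resp.choices[0].message.content
```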
beng-nl (8 months ago): This might not always work, but whenever possible a working exploit could be demanded, in a form that can be verified automatically.
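A rough sketch of that "ship a runnable proof" requirement: run the submitted PoC against the target and accept the report only if the exploit observably fires. The paths, the choice of a crash/nonzero exit as the success signal, and the lack of sandboxing are all placeholder assumptions:

```python
# Sketch: accept a bug report only if its proof-of-concept reproduces
# automatically. In practice this should run in an isolated sandbox.
import subprocess

def poc_reproduces(poc_script: str, timeout_s: int = 30) -> bool:
    try:
        result = subprocess.run(
            ["python", poc_script],  # hypothetical PoC entry point
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False  # a PoC that hangs is not automatically verifiable
    # Placeholder success criterion: the harness inside the PoC exits nonzero
    # (e.g. a sanitizer abort) only when the claimed bug actually triggers.
    return result.returncode != 0

if __name__ == "__main__":
    print("exploit verified" if poc_reproduces("poc.py") else "not reproduced")
```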
jgalt212 (8 months ago): I think Claude, given enough time to mull it over, could probably come up with some sort of bug severity score.
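One way that could look, as a hedged sketch using the Anthropic Python SDK; the model name, the 0-10 rubric, and the integer-only output format are assumptions, not a recommendation:

```python
# Sketch: ask Claude for a rough 0-10 severity score for a triaged report.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

def severity_score(report: str) -> int:
    msg = client.messages.create(
        model="claude-sonnet-4-20250514",  # placeholder model name
        max_tokens=10,
        messages=[{
            "role": "user",
            "content": (
                "Rate the severity of this bug report on a 0-10 scale "
                "(0 = not a bug, 10 = remote code execution). "
                "Reply with only the integer.\n\n" + report
            ),
        }],
    )
    return int(msg.content[0].text.strip())
```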