Comment by RobertDeNiro

3 months ago

I think the prompt is probably at fault here. You can use LLMs for object segmentation and they do fairly well; less than 1% seems too low.

The cross-tile challenges were quite robust: every model struggled with them, and we tried several iterations of the prompt. I'm sure you could improve results with specialized systems, but out of the box the models definitely struggle with segmentation.