Comment by dolebirchwood

7 hours ago

I have a project where I'm using LLMs to parse data from PDFs with a very complicated tabular layout. I've been using the latest Gemini models (Flash and Pro) for their strong visual reasoning, and they've generally been doing a really good job of it.

My prompt states that the model's job is to extract the text exactly as it appears in the PDF. One of the data points to be extracted is the race of each person listed. In one case, someone's race was "Indian". Gemini decided to extract it as "Native American". So ridiculous.
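For what it's worth, the verbatim-extraction instruction can be sketched roughly like this. The prompt wording and the `build_request` helper are hypothetical illustrations, not the actual pipeline, and the request shape assumes a multimodal `generate_content`-style API:

```python
# Hypothetical sketch: a verbatim-extraction prompt for a vision model.
# The wording and helper below are illustrative assumptions, not the
# commenter's real pipeline.
EXTRACTION_PROMPT = (
    "Extract the text from this PDF exactly as it appears. "
    "Do not normalize, translate, or substitute terms: if a field "
    "reads 'Indian', output 'Indian', never a synonym or a "
    "'corrected' label."
)

def build_request(pdf_bytes: bytes) -> list:
    """Assemble the multimodal parts for a generate_content-style call."""
    return [
        {"mime_type": "application/pdf", "data": pdf_bytes},
        EXTRACTION_PROMPT,
    ]
```

Even with instructions this explicit, the model can still "helpfully" rewrite sensitive terms, which is exactly the failure described above.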

I was trying to help someone who runs a small shop selling restored clothing set up a Gemini pipeline that would restage photos she had taken of clothing items with bad lighting, backgrounds, etc.

It refused to interact with basically anything that showed any "skin" on the mannequin, even just a top, unless she put pants on the mannequin.

It was infuriating.