Comment by kym6464
16 days ago
RE: the loss of bounding box information
You can recover word-level bounding boxes and confidence scores by running a traditional OCR engine such as AWS Textract on the same document and matching its results to Gemini’s output. See https://docless.app for a demo (disclaimer: I am the founder).
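To make the matching step concrete, here is a minimal sketch of one way it could work. This is not necessarily how docless does it. It assumes the OCR words have already been fetched and flattened into a hypothetical `OcrWord` record (text, bounding box, confidence), and it aligns them to the LLM's tokens with Python's standard `difflib.SequenceMatcher`. The names `OcrWord`, `normalize`, and `match_words` are made up for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


@dataclass
class OcrWord:
    """Hypothetical flattened OCR word; not an actual Textract type."""
    text: str
    bbox: Tuple[float, float, float, float]  # assumed (left, top, width, height)
    confidence: float


def normalize(token: str) -> str:
    # Compare on lowercase alphanumerics only, so "Total:" matches "total".
    return "".join(ch for ch in token.lower() if ch.isalnum())


def match_words(
    llm_text: str, ocr_words: List[OcrWord]
) -> List[Tuple[str, Optional[OcrWord]]]:
    """Pair each whitespace-separated LLM token with an OCR word, or None."""
    llm_tokens = llm_text.split()
    a = [normalize(t) for t in llm_tokens]
    b = [normalize(w.text) for w in ocr_words]

    matched: List[Optional[OcrWord]] = [None] * len(llm_tokens)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # Each matching block is a run of equal tokens in both sequences.
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            matched[block.a + k] = ocr_words[block.b + k]
    return list(zip(llm_tokens, matched))


ocr = [
    OcrWord("Invoice", (0.10, 0.05, 0.12, 0.02), 99.1),
    OcrWord("Total:", (0.10, 0.10, 0.09, 0.02), 98.4),
    OcrWord("$42.00", (0.21, 0.10, 0.08, 0.02), 97.0),
]
result = match_words("Invoice Total: 42.00 USD", ocr)
for token, word in result:
    print(token, word.bbox if word else None)
```

Tokens the LLM added or rewrote (here, "USD") come back with `None`, which is a useful signal in itself: those are the spots with no OCR evidence behind them. Exact matching after normalization is the simplest choice; a real pipeline would probably need fuzzy matching to tolerate OCR misreads.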