
Comment by ThinkBeat

17 days ago

Hmm, I have been doing a bit of this manually lately for a personal project. I am working on some old books that are far past any copyright, but they are not available anywhere on the net. (Being in Norwegian makes a book a lot more obscure.) So I have been working on creating ebooks out of them.

I have a scanner, and some OCR processes I run things through. My automatic process gets me to about 85%.

The pain of going from 85% to 99%, though, is considerable (and in my case manual, though Perl helps).

I went to try this AI on one of the short poem manuscripts I have.

I told the prompt I wanted PDF to Markdown, and it said sure, go ahead, give me the PDF. I went to upload it. It spent a long time spinning, then a quick message came up, something like

"Failed to count tokens"

but it just flashes and goes away.

I guess the PDF is too big? Weird though, it's not a lot of pages.

I experienced something similar. My use case is that I need to summarize bank statements (sums, averages, etc.). Gemini wouldn't do it; it said there were too many pages. When I asked for the maximum number of supported pages, it said the max is 14 pages. I tried this on both 2.0 Flash and 2.0 Pro in the Vertex AI console.

  • Try with https://aistudio.google.com. I think the page limit is a Vertex thing. The only limit in reality is the number of input tokens taken to parse the PDF. If those tokens + the tokens for the rest of your prompt are under the context window limit, you're good.
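
To make the token arithmetic above concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function names are made up, and the per-page token cost and context window size are placeholder numbers, not values confirmed for any particular model. Check your model's documentation (or its token counting endpoint) for real figures.

```python
# Rough sketch of the "PDF tokens + prompt tokens must fit in the context
# window" check. All names and numbers are hypothetical placeholders.

def estimate_pdf_tokens(pages: int, tokens_per_page: int) -> int:
    """Estimate input tokens for a PDF, assuming a flat cost per page."""
    return pages * tokens_per_page


def fits_in_context(pdf_tokens: int, prompt_tokens: int, context_window: int) -> bool:
    """True if the PDF plus the rest of the prompt fits within the window."""
    return pdf_tokens + prompt_tokens <= context_window


if __name__ == "__main__":
    # Assumed values: 14 pages, 258 tokens per page, a 200-token prompt,
    # and a 1,000,000-token context window.
    pdf_tokens = estimate_pdf_tokens(pages=14, tokens_per_page=258)
    print(pdf_tokens)
    print(fits_in_context(pdf_tokens, prompt_tokens=200, context_window=1_000_000))
```

By this arithmetic a short PDF should be nowhere near the limit, so a failure on a small file more likely points at a product-specific page cap or at something odd inside the PDF itself.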

Take a screenshot of the PDF page, give that to the LLM, and see if it can be processed.

Your PDF might have some quirks inside which the LLM cannot process.