Comment by m3kw9 5 months ago You don’t really feed images to LLMs; you feed them to a vision model within the multimodal LLM.
ritvikpandey21 5 months ago Yup, important clarification! However, the language portion of the model also takes part in the extraction, and it is prone to hallucinations.