Comment by famouswaffles

3 years ago

No worries. Like I said, that was just a general overview.

Strictly speaking, the model doesn't have to be frozen (though unfreezing tends to make the original model perform much worse at NLP tasks), and the task isn't necessarily just image-to-text (PaLM-E, for example, also trains to extract semantic information from objects in an image).
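
To make the frozen vs. unfrozen distinction concrete, here's a toy sketch in plain Python (no real framework). The names `Param`, `sgd_step`, `make_model`, `projector.w` and `llm.w` are all made up for illustration, and the gradients are arbitrary numbers, not from any actual model. The only point is that a frozen language model keeps its weights while the projection layer learns, whereas an unfrozen one gets updated too.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """A single scalar weight with a flag saying whether it may be updated."""
    value: float
    trainable: bool = True


def sgd_step(params, grads, lr=0.1):
    """Apply one plain SGD update, skipping any frozen parameters."""
    for name, p in params.items():
        if p.trainable:
            p.value -= lr * grads[name]


def make_model(freeze_llm):
    """Toy 'multimodal model': an image projector plus a language model weight."""
    return {
        "projector.w": Param(0.5),                      # always trained
        "llm.w": Param(1.0, trainable=not freeze_llm),  # optionally frozen
    }


# Arbitrary illustrative gradients (hypothetical values).
grads = {"projector.w": 0.2, "llm.w": 0.3}

frozen = make_model(freeze_llm=True)
sgd_step(frozen, grads)      # only the projector moves

unfrozen = make_model(freeze_llm=False)
sgd_step(unfrozen, grads)    # both the projector and the LLM weight move
```

In the frozen case `llm.w` is untouched, so whatever the language model already knew is preserved. In the unfrozen case it drifts along with the multimodal objective, which is the mechanism behind the NLP degradation mentioned above.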