Comment by corysama

3 months ago

From the paper I saw that the model includes an approximation of the layout, diagrams and other images of the source documents.

Now imagine growing up allowed to read books and the internet only through a browser with CSS, images and JavaScript disabled. You’d be missing out on a lot of context and side-channel information.