Comment by lelandfe

4 years ago

I may have figured it out. The site is committing hijinks with the text. They're manually wrapping the rendered text with `<br>`s and then manually wrapping the HTML source itself with newlines and runs of spaces. Here's the HTML of the lines in question:

    <DIV>The voice was deep and melodious when it spoke. &#8220;David, we have been <BR>expecting you - this is what you have
                               been searching for - this place, <BR>David, is where dreams are born.&#8221; It was at this moment David realized <BR>the
                               being was speaking to him with its own voice, not by thought. David <BR>stood unmoving. He realized he had never dreamed before
                               or even had ever <BR>slept.

If you search for same-line sentence fragments you'll find the page: https://www.google.com/search?q=%22The+voice+was+deep+and+me.... Not an excuse: this is a case Google should handle.
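
A rough sketch of the kind of normalization that would make the full sentence searchable again. This is only an illustration, not how Google actually indexes pages: the `normalize` helper and the `RAW` sample are my own names, and it assumes the `<BR>`s here are pure layout breaks that can be treated as a single space.

```python
import html
import re

# Sample copied from the page source quoted above (first two source lines only).
RAW = """<DIV>The voice was deep and melodious when it spoke. &#8220;David, we have been <BR>expecting you - this is what you have
                               been searching for - this place, <BR>David, is where dreams are born.&#8221;"""


def normalize(fragment: str) -> str:
    """Flatten hard-wrapped HTML into one line of plain text (hypothetical helper)."""
    # Assumption: <br> is only a visual line break, so replace it with a space.
    text = re.sub(r"<br\s*/?>", " ", fragment, flags=re.IGNORECASE)
    # Drop any remaining tags such as <DIV>.
    text = re.sub(r"<[^>]+>", "", text)
    # Decode entities like &#8220; into the characters they stand for.
    text = html.unescape(text)
    # Collapse newlines and runs of spaces from the source wrapping.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize(RAW))
```

With that, a phrase that straddles both a `<BR>` and a source line break, such as "this is what you have been searching for - this place, David", becomes one contiguous string.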

For posterity: https://imgur.com/a/DAUpLit

Back in the day, when every site was full of <br>s and &nbsp;s, Google was not at all confused by it.

  • Are we sure about that? My recollection is the same, but it would be nice to have some way of ensuring my memory isn't faulty...

    • Just to remind you of how things were when Google first launched (1996): the W3C had only just published CSS level 1 as a recommendation (https://www.w3.org/Press/CSS1-REC-PR.html), people were using dl, dt, ul, li and blockquote elements for "styling" (layout, really) websites, Internet Explorer 1.0 had launched the year before, and most people who wrote HTML documents were amateurs at best. It's a 100% bet that the markup of yore was messed up compared to today's "standards".