Comment by esseph
2 days ago
Hmmm
I have no idea how this would work; just brainstorming.
Could you... use some browser backend to render the page to a PDF, then an LLM to scrape the content and display it as text?
I know it wouldn't be exactly efficient, but...
A more pragmatic approach would be to run the content through something like readability[0] but leave navigation untouched. An LLM could hallucinate and add content that isn't in the original, something accessibility tools don't do.
[0]: https://github.com/mozilla/readability
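To make the idea concrete, here is a toy sketch (stdlib only, and deliberately *not* Mozilla Readability's actual scoring heuristics) of extracting plain text from HTML while keeping link targets visible, so navigation survives the conversion:

```python
# Toy sketch: flatten HTML to text but keep <a href> targets inline,
# so a text browser still has something to navigate with.
# This is NOT the Readability algorithm, just an illustration.
from html.parser import HTMLParser

class TextWithLinks(HTMLParser):
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            self.out.append(f"[{href}] ")

    def handle_data(self, data):
        self.out.append(data)

p = TextWithLinks()
p.feed('<p>Read <a href="https://offpunk.net">Offpunk</a> docs.</p>')
print("".join(p.out))
# -> Read [https://offpunk.net] Offpunk docs.
```

The real Readability library additionally scores nodes by text density, class names, and link ratio to find the main article; this sketch only shows the "keep navigation" half of the trade-off.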
This is exactly what Offpunk is doing: displaying the HTML page after it has passed through Readability.
https://offpunk.net
The whole page is still available with "view full" (or "v full")
In the current trunk, if configured, it uses ftr-site-config rules to extract content for specific websites (https://github.com/fivefilters/ftr-site-config).
I do 90% of my browsing in Offpunk (reading blogs and articles) and, surprisingly, it often works better than a graphical browser (no ads, no popups, no paywalls). Of course, it doesn't work when you really need JS.
Dillo uses something similar with rdrview: you can use rdrview://$URL (although I hacked the DPI plugin to use the rd:// 'protocol' for shortness).
It lacks the filter thingy, but it now has the dilloc tool, which can print the current URL, open a new page, and so on; with sed you can trivially reopen a page with an alternative front end from https://farside.link
You know, medium.com -> scribe.rip and the like.
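The rewrite step itself is trivial string substitution; a hedged sketch in Python (the mapping below is illustrative, only the medium.com -> scribe.rip pair comes from the comment above, the rest is a hypothetical farside-style table):

```python
# Rewrite a URL to an alternative front end, the way one might pipe
# dilloc's printed URL through sed. Mapping is illustrative, not complete.
ALTERNATES = {
    "medium.com": "scribe.rip",    # pair mentioned in the comment
    "twitter.com": "nitter.net",   # hypothetical farside-style entry
}

def to_alternative(url: str) -> str:
    for host, alt in ALTERNATES.items():
        url = url.replace("://" + host, "://" + alt)
    return url

print(to_alternative("https://medium.com/@someone/post"))
# -> https://scribe.rip/@someone/post
```

The sed equivalent is a one-liner per mapping (e.g. `s|://medium\.com|://scribe.rip|`), which is why this works so well glued onto dilloc.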
But Dillo is not a terminal browser, although it's a really barebones one, and thanks to DPI and dilloc it can be really powerful (gopher, gemini, ipfs, man, and info in the future, all available as simple plugins written in sh, C or even Go) and inspiring for both Offpunk and w3m (which has capabilities similar to Dillo's for printing/mangling URLs and the like).
What I'd love is to integrate Apertium (or any translation service) with Dillo as a plugin, so that just by opening trans://example.com you could get any page translated inline, without running tons of Google's proprietary JS to achieve the same task.
I love the https://linux.org.ru forum and often they post interesting setups but I don't speak Russian.
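A hypothetical sketch of what such a trans:// handler could look like, in the spirit of a Dillo DPI plugin (the scheme handling is real stdlib code; the `translate()` stub is a placeholder where Apertium or any other engine would be called, since I don't want to invent its API here):

```python
# Hypothetical trans:// scheme handler: strip the custom scheme, fetch
# the underlying page, and pass its text to a translation backend.
from urllib.parse import urlparse

def resolve(url: str) -> str:
    """Map trans://host/path to the real https URL to fetch."""
    parsed = urlparse(url)
    assert parsed.scheme == "trans"
    return "https://" + parsed.netloc + parsed.path

def translate(text: str, target: str = "en") -> str:
    # Stub: plug in Apertium (or another service) here.
    return text

print(resolve("trans://linux.org.ru/forum"))
# -> https://linux.org.ru/forum
```

The plugin would then fetch the resolved URL, run the extracted text through `translate()`, and hand the result back to the browser, all without any page-side JS.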
So you mean that someone uses an LLM to generate a website full of JS, posts text on it, and then we use an LLM to try to rebuild the original content?
If only we had a way to just share text without all those steps…
You're misunderstanding.
You go to a site with your text browser. An LLM loads and renders the content in memory, then helps convert that into a text-only interface for your TUI browser to display and navigate.
Apparently other systems are using a similar method.