Comment by kwhitefoot

3 years ago

Could machine learning help with this? Could some algorithm be created to strip the padding from such articles, leaving just the useful stuff?

6 comments

kwhitefoot

I would expect machine learning to be more useful in generating this kind of fluff.

Seems like a lot of work (and a lot of time to get working reliably) for a relatively minor issue that can mostly be fixed by skim reading.

No.

If you want this to change, change the source; don't patch cargo cult on top.

Avalaxy 3 years ago

You're wrong: this is entirely possible with algorithms, and has been for many years. For example, check the Auto TLDR bot for Reddit. Here's an example of what it could look like for the article posted here: https://smmry.com/https://www.science.org/content/article/sc...
It's not even hard to do: you mostly have to count token frequencies, which is hardly machine learning, although you could also do it with BERT.
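
To make the "count token frequencies" idea concrete, here is a rough sketch of a frequency-based extractive summarizer. It is only an illustration, not how SMMRY or the Auto TLDR bot actually work (I don't know their internals). The names `summarize`, `tokenize` and `STOPWORDS` are made up for this example, the stopword list is deliberately tiny, and the sentence splitter is a naive regex.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stopword list; a real one would be much longer.
STOPWORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that",
    "this", "for", "on", "with", "as", "are", "was", "be", "by",
})


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Keep the sentences whose words are, on average, most frequent in the text."""
    # Naive split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        # Average rather than sum, so long sentences are not favoured automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, then restore original order for readability.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    article = (
        "Cats chase mice. Cats sleep a lot. "
        "The weather was nice today. Cats and mice rarely share food."
    )
    print(summarize(article, max_sentences=2))
```

Sentences built from words that recur across the article score highest, so off-topic filler like the weather sentence is dropped. Whether that is good enough for real padded articles is a separate question; a BERT-style model would replace the `score` function with something based on learned sentence representations.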