Comment by kwhitefoot
3 years ago
Could machine learning help with this? Could some algorithm be created to strip the padding from such articles, leaving just the useful stuff?
Yes. See https://smmry.com/https://www.science.org/content/article/sc... for example.
Thank you! Works very well.
I would expect machine learning to be more useful in generating this kind of fluff.
Seems like a lot of work (and a lot of time to get working reliably) for a relatively minor issue that can mostly be fixed by skim reading.
No.
If you want this to change, change the source, don't patch cargo cult on top.
You're wrong: this is entirely possible with algorithms, and it has been a thing for many years already. For example, check the Auto TLDR bot for Reddit. Here's an example of what it could look like for the article posted here: https://smmry.com/https://www.science.org/content/article/sc...
It's not even hard to do, you mostly have to count token frequencies which is hardly machine learning. Although you could also do it with BERT.
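To make the token-frequency idea concrete, here is a rough sketch in Python. It is not how smmry or the Auto TLDR bot actually work (I don't know their internals); the `summarize` function, the tiny `STOPWORDS` list, and the scoring rule (average word frequency per sentence) are all my own assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real tool would use a larger one.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "that",
             "this", "for", "on", "with", "as", "was", "are", "be"}


def tokenize(sentence):
    """Lowercase the sentence and keep only non-stopword words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Count how often each token appears in the whole text.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average frequency of the sentence's tokens (assumed scoring rule).
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Pick the indices of the top-scoring sentences, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    article = ("Cats chase mice. Cats like mice and cats like fish. "
               "The weather was pleasant yesterday.")
    print(summarize(article, max_sentences=1))
```

Sentences that reuse the most common words in the article float to the top, and the off-topic filler sinks. No training involved, just counting.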