nubinetwork 2 years ago
Surprise surprise... Bytespider is at the top of the list.

bluetidepro 2 years ago
(Being lazy, not Googling) What is Bytespider, and why “surprise surprise”?

nuz 2 years ago
This is TikTok's scraper? How come TikTok does mass scraping of websites?

jsheard 2 years ago
ByteDance isn't just TikTok. As mentioned in the article, they have their own LLM product called Doubao.

CharlieDigital 2 years ago
The most straightforward explanation is training data for an LLM.

heywire 2 years ago
Kind of makes me want to build a dynamic web server that spews plausible garbage to poison their training set. Probably a bit like peeing in the ocean though.