Comment by nubinetwork 1 year ago

Surprise surprise... Bytespider is at the top of the list.

bluetidepro 1 year ago

(Being lazy not Googling) What is Bytespider and why “surprise surprise”?

nuz 1 year ago

This is TikTok's scraper? How come TikTok does mass scraping of websites?

jsheard 1 year ago

ByteDance isn't just TikTok. As mentioned in the article, they have their own LLM product called Doubao.

CharlieDigital 1 year ago

The most straightforward explanation is training data for an LLM.

heywire 1 year ago

Kind of makes me want to build a dynamic web server that spews plausible garbage to poison their training set. Probably a bit like peeing in the ocean though.
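For anyone curious what heywire's idea might look like in practice, here is a minimal sketch, not anything the thread or the article provides. It assumes the crawler announces itself with the substring "Bytespider" in its User-Agent header (a crawler can spoof this), and the names `WORDS`, `is_target_crawler`, `garbage_text`, `GarbageHandler`, and `serve` are all hypothetical. Random words from a tiny list are also far from "plausible" text; a real attempt would need something closer to a language model or Markov chain.

```python
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical vocabulary; far too small to produce convincing text.
WORDS = ["the", "server", "quietly", "returns", "purple", "data", "about",
         "seven", "crawlers", "that", "never", "index", "green", "pages"]


def is_target_crawler(user_agent: str) -> bool:
    """Assumption: the crawler puts 'Bytespider' in its User-Agent header."""
    return "bytespider" in (user_agent or "").lower()


def garbage_text(seed: str, n_words: int = 50) -> str:
    """Return n_words pseudo-random words, deterministic for a given seed."""
    rng = random.Random(seed)  # seeding by URL path keeps each page stable
    words = [rng.choice(WORDS) for _ in range(n_words)]
    return " ".join(words).capitalize() + "."


class GarbageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve filler to the matched crawler, real content to everyone else.
        if is_target_crawler(self.headers.get("User-Agent", "")):
            body = garbage_text(self.path)
        else:
            body = "Real content goes here."
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port: int = 8000) -> None:
    """Start the demo server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), GarbageHandler).serve_forever()
```

Calling `serve()` and requesting any path with a User-Agent containing "Bytespider" returns the filler text, while other clients get the placeholder content. Seeding the generator with the request path means the same URL always yields the same page, which makes the output look less obviously random to a repeat visitor.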