Comment by bearjaws

14 hours ago

Feel like the canary was when Grokipedia became a project.

Giant waste of time while Anthropic/OAI keep surging forward.

I also keep hearing this narrative that Twitter is a good data source, but I cannot imagine it's a valuable dataset. Sure, keeping up with real-time topics can be useful, but I am not sure how much of a product that is.

159 comments


paulbjensen 12 hours ago

The Twitter social graph was an amazing data asset. I worked at a consumer insights firm and the data on followers/followings was quite powerful.

Using a custom taxonomy of things (celebrities, influencers, magazines, brands, tv shows, films, games, all kinds of things), we could identify groups of people who liked certain things, and when you looked at what those things were, it gave you a way of understanding who those people were.

With that data, you could work out:

- What celebrities/influencers to use in marketing campaigns
- Where to advertise, and on which tv/radio channels
- What potential brands to collaborate with to expand your customer base
- What tone of voice to use in your advertising
- In some cases, we educated clients about who their actual customers were, better than they understood themselves.
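
As a rough illustration of the follower-graph analysis described above (a sketch only, not the firm's actual pipeline): assume each user is represented by the set of accounts they follow, and a hand-built taxonomy maps accounts to categories. One could then check which categories a brand's followers over-index on relative to all users. `TAXONOMY`, `FOLLOWS`, the handles, and both function names are invented for this example.

```python
from collections import Counter

# Hypothetical taxonomy: account handle -> category (invented for illustration).
TAXONOMY = {
    "@footy_daily": "sport",
    "@lads_mag": "magazine",
    "@cooking_show": "tv",
}

# Hypothetical follow data: user -> set of accounts they follow.
FOLLOWS = {
    "u1": {"@deo_brand", "@footy_daily", "@lads_mag"},
    "u2": {"@deo_brand", "@lads_mag"},
    "u3": {"@cooking_show"},
    "u4": {"@cooking_show", "@footy_daily"},
}


def category_share(users):
    """Fraction of the given users who follow at least one account in each category."""
    counts = Counter()
    for user in users:
        categories = {TAXONOMY[a] for a in FOLLOWS[user] if a in TAXONOMY}
        counts.update(categories)
    return {cat: n / len(users) for cat, n in counts.items()}


def over_index(brand):
    """Category share among the brand's followers divided by the share among all users."""
    audience = [u for u, accounts in FOLLOWS.items() if brand in accounts]
    audience_share = category_share(audience)
    baseline_share = category_share(list(FOLLOWS))
    return {cat: share / baseline_share[cat] for cat, share in audience_share.items()}


affinities = over_index("@deo_brand")
```

A ratio above 1 means the brand's followers are more likely than the average user to follow something in that category, which is the kind of signal that would feed decisions about channels, collaborations, and tone.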

In one scenario, we built a social media feed based on what a group of customers following a well-known deodorant brand in the UK would see.

When we presented that to the client, they said “Why are there so many women in bikinis in this feed?”

The brand had repositioned themselves to a male-grooming focussed target market, but had failed to realise that their existing customer base were the ones that had been looking at their TV adverts of women on beaches chasing a man who happened to spray their Deodorant on them. Their advertising from the past had been very effective.

That was the power of Twitter’s data, and it is an absolute shame that Twitter went the way that it did. Mark Zuckerberg once said that Twitter was like “watching a clown car driven into a gold mine”.

I’m pretty sure he must be delighted with how things have panned out since.

BLKNSLVR 11 hours ago

That entire description sounds worthless to any positive direction of humanity. Therefore probably rapaciously profitable
Very sad face.
mbs159 3 hours ago

Damn, this only validates the use of ad-blockers / sponsor-blockers even more
rchaud 9 hours ago

In other words, using flash-in-the-pan data to build an advertising goldmine.
johnisgood 10 hours ago
This reads very dystopian. You are not optimizing to understand people, you are optimizing to weaponize that understanding against them.
When you know what someone will buy based on exploiting their unconscious preferences, and you are paid to increase sales, you will do it. Especially if your competitors are doing it too.
And this happens at scale, invisibly. People never see the manipulation.
In any case, it is not useful for most people. It is useful for the people doing the deceiving.
- caaqil 9 hours ago
  
  The tech is interesting and useful, no need for the scary moral framing.
  The original application of the entire field of data science or ML is/was actually based on this paradigm of finding "unconscious preferences" (your words) and hidden patterns. How one chooses to deploy the tech should be judged on its own.
  On the current trajectory of tool/data abuse where Palantir et al. are leading the way, this is very low on the sinister scale.
  
- etchalon 10 hours ago
  
  It's marketing. That's how marketing works.
smcin 12 hours ago

That Zuckerberg quote was published in 2013 and supposedly was made a year or more before. Was it about when Dick Costolo was CEO (2010-2012)?
gwern 11 hours ago
It's definitely very valuable, but for what AI model? How does any of that lead to AGI, or even just a good coding agent?
- applfanboysbgon 11 hours ago
  
  It doesn't need to lead to AGI or a good coding agent. Some of the only people who are actually profitable in the LLM industry are the people making actual chatbots. There are several bootstrapped startups that run open-weight models with a $10 or $20 monthly sub and make millions in profit off of inference from people just talking to the things, usually for character roleplay / "AI boyfriend/girlfriend" stuff etc. Some of them even took those profits and invested it into training their own bespoke models from scratch, usually on the smaller side although finetunes/retrains of Llama 70b, GLM, and Deepseek 670b have also been done. Grok could probably be profitable if it targeted this space, as the most "intelligent" conversational/uncensored model.
  This is already presupposing that profit even matters, though. Musk already burned some $50 billion to control messaging on political discourse with his acquisition of Twitter. It was not about money, but power. After you already have infinite money, the only thing left to spend it on is acquiring more power, which is achieved through influencing politics. LLMs represent a potentially even better propaganda tool than social media platforms. They give you unprecedented access to people's thoughts that they would probably not share online otherwise, and they allow you to more subtly influence people with deeply-personalised narratives.
- KaiserPro 11 hours ago
  
  > but for what AI model?
  Sentiment analysis. Working out what words lead to what outcomes, and then being able to predict on new data is super useful.
  For coding or "AGI", no, it's not useful. For building a text-based (possibly image-based) recategorisation system, it's top class.
alex1138 11 hours ago

As an aside, that quote from MZ does bother me. There's more to making a web-scale, human-rights-respecting platform (it has to be one, because it's the internet and social media needs guidelines) than just making money (which Zuck doesn't seem to care much about anyway, if he's apparently sinking billions into the metaverse while having no account support).
Of course he would only see it through the lens of cash. I have no idea how profitable Twitter was under Dorsey, but it felt like the spirit of the company at first was relatively neutral: it was a tool, it was what Jack came up with.
Zuck replaced people's email addresses[1], and the feed has been wildly unchronological for years. Fix some of those problems wrt. lack of user respect and maybe you can make statements like "all else being equal, clown car gold mine". Or was it "dumb fucks"[2]?
[1] https://news.ycombinator.com/item?id=4151433
[2] https://news.ycombinator.com/item?id=1692122
cyanydeez 12 hours ago
It _was_ a great asset. However, just as models need proper data, as soon as Musk removed the clamps on valuable social signals, well, he basically took a dump where he intended to eat.
- ohyoutravel 10 hours ago
  
  They did say was, and did say Twitter, which existed in the past.

brokencode 13 hours ago

It’s pretty telling that Elon had to have Grok rewrite Wikipedia because the truth was too woke for him. No idea how anybody can ever take Grok seriously.

freehorse 13 hours ago
Many projects in his companies seem to be more and more Musk's vanity projects rather than ideas/products one can take seriously. This is also how Tesla ended up with a huge Cybertruck stock that nobody wants to buy and that thus had to be bought by his other companies. And it is becoming worse and worse, especially ever since he bought Twitter and sped up his tweeting rate.
- dmarcos 12 hours ago
  
  FWIW it looks like there’s now a demand surge with the introduction of the new cheap Cybertruck variant. Delivery dates have been pushed out to the fall of 2026.
  
- scottyah 12 hours ago
  
  [flagged]
- annexrichmond 10 hours ago
  
  Drivel. They’re selling just as well as Rivians.
squarefoot 13 hours ago

Probably the next generations of kids being fed PragerU study material will. Something tells me we haven't seen a fraction of what's going to happen in the decades to come.
annexrichmond 10 hours ago
Are you really suggesting everything in Wikipedia is truthful, complete, and free of all biases?
- hananova 8 hours ago
  
  Maybe not all of it, but a vast majority of it is. And almost certainly the parts that drove Elon to slopify it are true.
  
- comicjk 8 hours ago
  
  Not everything on Wikipedia is true, but the parts Elon Musk hates most are probably true.
  
Timon3 13 hours ago
I take Grokipedia very seriously as a threat to society. Sure, they're happy if people read it and fall for it, but the primary goal is not to convince humans but to influence the search results of current models and to poison the training data of future models. ChatGPT (and most likely other models/providers too) is already using Grokipedia as a source, so unless you're aware of the possibility and always careful, you might be served Musk's newest culture war ideas without ever being the wiser.
It's not enough that everyone on Twitter is forced to read his thoughts, he's trying to make sure his influence reaches everyone else too.
- danabramov 12 hours ago
  
  I've seen Claude pick it up too. It's disconcerting.
alex1138 13 hours ago
I can both not like Elon and think Wikipedia is very captured on some things
- ryandrake 13 hours ago
  
  Are there actual good examples showing errors of fact on Wikipedia that are verifiably incorrect, that demonstrate how it is "captured"?
  
- freehorse 13 hours ago
  
  I can understand somebody not liking Wikipedia, but I cannot understand at all somebody who is not Elon liking/preferring "grokipedia" as an idea or implementation.
  
- Rover222 10 hours ago
  
  I appreciate you
Rover222 10 hours ago
Wikipedia obviously is left leaning.
- hananova 8 hours ago
  
  Well yes, but so is reality. And Wikipedia as an encyclopedia is supposed to document reality. So what's the problem?
  
tclancy 13 hours ago

[flagged]

notahacker 13 hours ago

Twitter's communication style being based around brevity, slang, memes, spam and non-threaded conversations seems particularly unlikely to be helpful for optimising LLMs

tclancy 13 hours ago
>Twitter's communication style being based around brevity
Is this still true? Every once in a while someone sends a link around to some madman explaining how race or economics or whatever "really" works and it's like a full dissertation with headings, footnotes, clip art. They're halfway to reinventing Grok-o-pedia right there in Twitter. I mean X. I was promised that "X gonna give it to you" but it turns out "it" is some form of brain chlamydia.
- 3rodents 13 hours ago
  
  Elon was running some sort of $1m competition for the “best” Twitter post for a few months. I think those types of dissertations about phrenology and the like have fallen off a cliff since the competition ended.
  
- delecti 5 hours ago
  
  There's probably a selection bias involved. I haven't been a regular user for a while now, but the big threads like that were significantly outnumbered by individual posts. Meanwhile I'm not likely to send a link to someone of a single single-sentence tweet, because there's not enough meat to it. The stuff that could be shared would usually be an image from the tweet, which I could share directly.
aleph_minus_one 13 hours ago

> Twitter's communication style [...] seems particularly unlikely to be helpful for optimising LLMs
This depends on what one wants to optimize the AI for. ;-)
libertine 13 hours ago
And the amount of bots there isn't helpful either.
- facemelt2 13 hours ago
  
  recent changes in their comment system have reduced my exposure to bots to a level I much prefer over every other platform I use
  

UncleOxidant 13 hours ago

> Giant waste of time while Anthropic/OAI keep surging forward.

And Google. They're quietly making a lot of progress in the coding space with Antigravity and Gemini 3.1.

koakuma-chan 13 hours ago
Has Antigravity gotten any better?
- sunaookami 12 hours ago
  
  It has gotten worse and they tightened the limits for paying customers recently: https://x.com/antigravity/status/2031835833716625883 (only announcement on Twitter, not in the app nor via email)
  
- htrp 8 hours ago
  
  >There is currently no support for:
  >Bring-your-own-key or bring-your-own-endpoint for additional rate limits
  >Organizational tiers in general availability, or via contract[1]
  Literal clown car product.
  No plan for serious enterprise support (even 6 months after launch)
  [1] https://antigravity.google/docs/plans
- UncleOxidant 10 hours ago
  
  I find it pretty good. And Gemini 3.1 Pro seems quite capable. Not as good at some things as Claude, but better at others. I was trying to target a Verilog design to an uncommon FPGA and board, and Gemini went out and searched for the FPGA docs and examined the schematics for the board in order to do the pin assignments (generated .ccf file). Not sure if Claude could've done that.
- BoredPositron 13 hours ago
  
  Probably the best value for a good amount of Anthropic credits. You can also share your Google AI subscription with up to four family members and they all get the same amount of credits...

jmspring 13 hours ago

Twitter has the mass adoption, and it takes an effort to avoid bot/particular view bias - but as a valuable content source, it's a far cry from what it once was before Musk took it over.

ben_w 13 hours ago

> Feel like the canary was when Grokipedia became a project. Giant waste of time while Anthropic/OAI keep surging forward.

Really? I assumed that that whole thing was just a very direct `for each article in Wikipedia { article = LLM(systemprompt, article) }`
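
For what it's worth, that guessed-at loop could be sketched in runnable form roughly like this. It only illustrates the assumption in the comment above and is not a description of how Grokipedia is actually built; `rewrite_all` and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in for a real model call.

```python
from typing import Callable, Dict


def rewrite_all(
    articles: Dict[str, str],
    system_prompt: str,
    llm: Callable[[str, str], str],
) -> Dict[str, str]:
    """Pass every article through the model with the same system prompt."""
    return {title: llm(system_prompt, text) for title, text in articles.items()}


def fake_llm(system_prompt: str, article: str) -> str:
    """Stand-in for a real model call; it just tags the text with the prompt."""
    return f"[{system_prompt}] {article}"


rewritten = rewrite_all(
    {"Earth": "Earth is a planet."},
    "Rewrite neutrally.",
    fake_llm,
)
```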

Agree re Twitter "good" != valuable.

sroussey 10 hours ago

Where system prompt lists a certain someone’s latest tweets.

sheepscreek 12 hours ago

AFAIK Grok still doesn’t have a CLI coding agent that works with a subscription. That’s a shame. Grok Code Fast 1 was pretty impressive when it came out - for what it did, and they never followed it up with a new version.

sroussey 10 hours ago

You can use Cursor with Grok, though my experience is that Grok is the worst of the API providers Cursor supports.

giancarlostoro 13 hours ago

> but I cannot imagine it's a valuable dataset.

It's going to be a mixed batch, but any time there's world events, since as far back as I can think, Twitter (now X) was always first in breaking news. There's plenty of people and news orgs still on X because they need to be for the audience.

samrus 11 hours ago

Twitter as a data source is interesting. I think it gets overhyped because that's Elon's grift. But I can't deny that the real-time info aspect of it is pretty valuable. But I definitely think that it's not that much more valuable than the open internet from a context source perspective. Everything worthwhile on Twitter will end up elsewhere with a bit of lag. And the stuff that won't is noise anyway.

laidoffamazon 10 hours ago

As someone trying to monitor the situation using Twitter the last few weeks it’s awful and it used to not be!

Rover222 9 hours ago
It’s flawed, but still the obvious place to monitor a situation.
- rchaud 9 hours ago
  
  It's long been taken over by Telegram, which among its other advantages (more like a message board than 'town square'), doesn't have hordes of people commenting "@grok explain this to me" under every post.

BurningFrog 12 hours ago

Grok is trained on pretty much the same giant web crawl/text corpus as the other AIs.

vibeprofessor 11 hours ago

[dead]

EGreg 12 hours ago

I'm not a fan of Elon's software endeavors, ever since he bought Twitter and turned it into an even worse cesspool of angry political nonsense than it used to be. I don't like how he's been biasing Grok, etc.

But, what exactly is so bad about Grokipedia? It's a different approach and I think a valid one: trying to do with AI what people have been doing manually at Wikipedia. I'm curious to hear the substantive comparisons.

kennywinker 11 hours ago
I think the issue is simply this: Wikipedia trends towards unbiased info through use of the crowd. Grok, with a single owner with an ax to grind, trends towards whatever Elon wants. It’s poisoned information under the control of one man; cyberpunk novels have been written about less.
- wat10000 11 hours ago
  
  A concrete example: a few weeks ago, Musk was making a big deal about how most of his massive net worth was not held in cash, and by a total coincidence the phrase "primarily derived from equity stakes rather than cash" showed up on his Grokipedia page in the section about net worth. I checked the pages of several other extremely wealthy people and none of them had such a comment.
- tmp10423288442 10 hours ago
  
  > wikipedia trends towards unbiased info through use of the crowd
  See, this is why people even give a project like Grokipedia the time of day. While in theory anyone can edit Wikipedia, in practice the moderators form a much smaller and weirder cabal, and they reject edits that go against their views. The frustration of comparing the naive assertion that Wikipedia distills the wisdom of the crowds with the reality of Wikipedia on any page of note is what provides the psychic permission to even entertain a project with such obvious flaws as Grokipedia.
  
Avshalom 9 hours ago

>>I don't like how he's been biasing Grok, etc.
>>But, what exactly is so bad about Grokipedia
sumeno 10 hours ago
It's controlled by a guy who spends all day retweeting white supremacists and lying about his companies. Why should anyone who isn't a white supremacist use it?
- baublet 6 hours ago
  
  They would not. They do not.