Comment by bearjaws

14 hours ago

Feel like the canary was when Grokipedia became a project.

Giant waste of time while Anthropic/OAI keep surging forward.

I also keep hearing this narrative that Twitter is a good data source, but I cannot imagine it's a valuable dataset. Sure, keeping up with realtime topics can be useful, but I am not sure how much of a product that is.

The Twitter social graph was an amazing data asset. I worked at a consumer insights firm and the data on followers/followings was quite powerful.

Using a custom taxonomy of things (celebrities, influencers, magazines, brands, tv shows, films, games, all kinds of things), we could identify groups of people who liked certain things, and when you looked at what those things were, it gave you a way of understanding who those people were.

With that data, you could work out:

- What celebrities/influencers to use in marketing campaigns
- Where to advertise, and on which TV/radio channels
- What potential brands to collaborate with to expand your customer base
- What tone of voice to use in your advertising
- In some cases, we educated clients about who their actual customers were, better than they understood themselves.
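The taxonomy approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the account handles, the category labels, and the `audience` and `category_share` helpers are all hypothetical and are not taken from the firm's actual tooling. The idea is to take everyone who follows a brand, map the other accounts they follow onto taxonomy categories, and look at which categories are over-represented.

```python
from collections import Counter

# Hypothetical taxonomy: account handle -> category label.
TAXONOMY = {
    "@football_weekly": "sport",
    "@mens_fitness_mag": "fitness",
    "@swimwear_models": "glamour",
    "@grooming_tips": "grooming",
}

# Hypothetical follow graph: user -> set of accounts they follow.
FOLLOWS = {
    "user_a": {"@deodorant_brand", "@swimwear_models", "@football_weekly"},
    "user_b": {"@deodorant_brand", "@swimwear_models"},
    "user_c": {"@deodorant_brand", "@grooming_tips"},
    "user_d": {"@football_weekly", "@mens_fitness_mag"},
}


def audience(brand, follows):
    """Return the set of users who follow the given brand account."""
    return {user for user, accounts in follows.items() if brand in accounts}


def category_share(users, follows, taxonomy):
    """Fraction of `users` following at least one account in each category."""
    if not users:
        return {}
    counts = Counter()
    for user in users:
        # Count each category once per user, however many accounts match.
        categories = {taxonomy[a] for a in follows[user] if a in taxonomy}
        counts.update(categories)
    return {cat: n / len(users) for cat, n in counts.items()}


brand_followers = audience("@deodorant_brand", FOLLOWS)
profile = category_share(brand_followers, FOLLOWS, TAXONOMY)

# Print categories from most to least common among the brand's followers.
for category, share in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{category}: {share:.0%}")
```

In practice you would compare each share against a baseline population to find what is distinctive about the audience, rather than reading the raw fractions.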

In one scenario, we built a social media feed based on the things that a group of customers following a well-known deodorant brand in the UK would see.

When we presented that to the client, they said “Why are there so many women in bikinis in this feed?”

The brand had repositioned themselves to a male-grooming focussed target market, but had failed to realise that their existing customer base were the ones that had been looking at their TV adverts of women on beaches chasing a man who happened to spray their Deodorant on them. Their advertising from the past had been very effective.

That was the power of Twitter’s data, and it is an absolute shame that Twitter went the way that it did. Mark Zuckerberg once said that Twitter was like “watching a clown car driven into a gold mine”.

I’m pretty sure he must be delighted with how things have panned out since.

  • That entire description sounds worthless to any positive direction of humanity. Therefore probably rapaciously profitable

    Very sad face.

  • This reads very dystopian. You are not optimizing to understand people, you are optimizing to weaponize that understanding against them.

    When you know what someone will buy based on exploiting their unconscious preferences, and you are paid to increase sales, you will do it. Especially if your competitors are doing it too.

    And this happens at scale, invisibly. People never see the manipulation.

    In any case, it is not useful for most people. It is useful for the people doing the deceiving.

    • The tech is interesting and useful, no need for the scary moral framing.

      The original application of the entire field of data science or ML is/was actually based on this paradigm of finding "unconscious preferences" (your words) and hidden patterns. How one chooses to deploy the tech should be judged on its own.

      On the current trajectory of tool/data abuse where Palantir et al. are leading the way, this is very low on the sinister scale.

  • That Zuckerberg quote was published in 2013 and supposedly was made a year or more before. Was it about when Dick Costolo was CEO (2010-2012)?

  • It's definitely very valuable, but for what AI model? How does any of that lead to AGI, or even just a good coding agent?

    • It doesn't need to lead to AGI or a good coding agent. Some of the only people who are actually profitable in the LLM industry are the people making actual chatbots. There are several bootstrapped startups that run open-weight models with a $10 or $20 monthly sub and make millions in profit off of inference from people just talking to the things, usually for character roleplay / "AI boyfriend/girlfriend" stuff etc. Some of them even took those profits and invested it into training their own bespoke models from scratch, usually on the smaller side although finetunes/retrains of Llama 70b, GLM, and Deepseek 670b have also been done. Grok could probably be profitable if it targeted this space, as the most "intelligent" conversational/uncensored model.

      This is already presupposing that profit even matters, though. Musk already burned some $50 billion to control messaging on political discourse with his acquisition of Twitter. It was not about money, but power. After you already have infinite money, the only thing left to spend it on is acquiring more power, which is achieved through influencing politics. LLMs represent a potentially even better propaganda tool than social media platforms. They give you unprecedented access to people's thoughts that they would probably not share online otherwise, and they allow you to more subtly influence people with deeply-personalised narratives.

    • > but for what AI model?

      Sentiment analysis. Working out what words lead to what outcomes, and then being able to predict on new data is super useful.

      For coding or "AGI", no, it's not useful. For building a text-based (possibly image-based) recategorisation system, it's top class.

  • As an aside, that quote from MZ does bother me. There's more to making a web-scale, human-rights-respecting platform (because it has to be, it's the internet, social media needs guidelines) than just making money (which Zuck doesn't seem to care much about anyway, if he's sinking apparently billions into the metaverse while having no account support).

    Of course he would only see it through the lens of cash. I have no idea how profitable Twitter was under Dorsey, but it felt like the spirit of the company at first was relatively neutral: it was a tool, it was what Jack came up with.

    Zuck replaced people's email addresses[1], and the feed has been wildly unchronological for years. Fix some of those problems wrt. lack of user respect and maybe you can make statements like "all else being equal, clown car gold mine". Or was it "dumb fucks"[2]?

    [1] https://news.ycombinator.com/item?id=4151433

    [2] https://news.ycombinator.com/item?id=1692122

  • It _was_ a great asset. However, just like models need proper data, as soon as Musk removed the clamps on valuable social signals, well, he basically took a dump where he intended to eat.

It’s pretty telling that Elon had to have Grok rewrite Wikipedia because the truth was too woke for him. No idea how anybody can ever take Grok seriously.

  • Many projects in his companies seem to be more and more Musk's vanity projects rather than ideas/products one can take seriously. This is also how Tesla ended up with a huge Cybertruck stock that nobody wants to buy and thus had to be bought by his other companies. And it is becoming worse and worse, especially ever since he bought Twitter and sped up his tweeting rate.

  • Probably the next generations of kids being fed PragerU study material will. Something tells me we haven't seen a fraction of what's going to happen in the decades to come.

  • I take Grokipedia very seriously as a threat to society. Sure, they're happy if people read it and fall for it - but the primary goal is not to convince humans, but to influence search results of current models & to poison the training data of future models. ChatGPT (and most likely other models/providers too) is already using Grokipedia as a source, so unless you're aware of the possibility and always careful, you might be served Musk's newest culture war ideas without ever being the wiser.

    It's not enough that everyone on Twitter is forced to read his thoughts, he's trying to make sure his influence reaches everyone else too.

  • I can both not like Elon and also think Wikipedia is very captured on some things.

Twitter's communication style being based around brevity, slang, memes, spam and non-threaded conversations seems particularly unlikely to be helpful for optimising LLMs

  • >Twitter's communication style being based around brevity

    Is this still true? Every once in a while someone sends a link around to some madman explaining how race or economics or whatever "really" works and it's like a full dissertation with headings, footnotes, clip art. They're halfway to reinventing Grok-o-pedia right there in Twitter. I mean X. I was promised that "X gonna give it to you" but it turns out "it" is some form of brain chlamydia.

    • Elon was running some sort of $1m competition for the “best” Twitter post for a few months. I think those types of dissertations about phrenology and the like have fallen off a cliff since the competition ended.

    • There's probably a selection bias involved. I haven't been a regular user for a while now, but the big threads like that were significantly outnumbered by individual posts. Meanwhile I'm not likely to send a link to someone of a single single-sentence tweet, because there's not enough meat to it. The stuff that could be shared would usually be an image from the tweet, which I could share directly.

  • > Twitter's communication style [...] seems particularly unlikely to be helpful for optimising LLMs

    This depends on what one wants to optimize the AI for. ;-)

> Giant waste of time while Anthropic/OAI keep surging forward.

And Google. They're quietly making a lot of progress in the coding space with Antigravity and Gemini 3.1.

  • Has Antigravity gotten any better?

    • >There is currently no support for:

      >Bring-your-own-key or bring-your-own-endpoint for additional rate limits

      >Organizational tiers in general availability, or via contract[1]

      Literal clown car product.

      No plan for serious enterprise support (even 6 months after launch)

      [1] https://antigravity.google/docs/plans

    • I find it pretty good. And Gemini 3.1 Pro seems quite capable. Not as good at some things as Claude, but better at others. I was trying to target a Verilog design to an uncommon FPGA and board, and Gemini went out and searched for the FPGA docs and examined the schematics for the board in order to do the pin assignments (generated .ccf file). Not sure if Claude could've done that.

    • Probably the best value for a good amount of Anthropic credits. You can also share your Google AI subscription with up to four family members and they all get the same amount of credits...

Twitter has the mass adoption, and it takes an effort to avoid bot/particular view bias - but as a valuable content source, it's a far cry from what it once was before Musk took it over.

> Feel like the canary was when Grokipedia became a project. Giant waste of time while Anthropic/OAI keep surging forward.

Really? I assumed that that whole thing was just a very direct `for each article in Wikipedia { article = LLM(systemprompt, article) }`
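Spelled out as runnable code, that guess might look something like the sketch below. To be clear, this is only the loop I'm imagining, not anything known about how Grokipedia is actually built. `rewrite_corpus` and `fake_llm` are made-up names, and `fake_llm` is a stand-in that just tags the text so the example runs without any model or API.

```python
from typing import Callable, Dict


def rewrite_corpus(
    articles: Dict[str, str],
    system_prompt: str,
    llm: Callable[[str, str], str],
) -> Dict[str, str]:
    """Return a new mapping in which every article body is passed through `llm`."""
    return {title: llm(system_prompt, body) for title, body in articles.items()}


def fake_llm(system_prompt: str, text: str) -> str:
    # Placeholder for a real model call (hypothetical): it only tags the input.
    return f"[rewritten under: {system_prompt}] {text}"


articles = {"Example": "Original text."}
rewritten = rewrite_corpus(articles, "Rewrite neutrally.", fake_llm)
print(rewritten["Example"])
```

The point is that the pipeline itself is trivial; whatever makes the output different from Wikipedia lives entirely in the system prompt and the model.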

Agree re Twitter "good" != valuable.

AFAIK Grok still doesn’t have a CLI coding agent that works with a subscription. That’s a shame. Grok Code Fast 1 was pretty impressive when it came out - for what it did, and they never followed it up with a new version.

  • You can use Cursor with Grok, though my experience is that Grok is the worst of the API providers Cursor supports.

> but I cannot imagine it's a valuable dataset.

It's going to be a mixed batch, but any time there's world events, since as far back as I can think, Twitter (now X) was always first in breaking news. There's plenty of people and news orgs still on X because they need to be for the audience.

Twitter as a data source is interesting. I think it gets overhyped because that's Elon's grift. But I can't deny that the real-time info aspect of it is pretty valuable. But I definitely think that it's not that much more valuable than the open internet from a context source perspective. Everything worthwhile on Twitter will end up elsewhere with a bit of lag. And the stuff that won't is noise anyway.

As someone who has been trying to monitor the situation using Twitter the last few weeks, it’s awful and it used to not be!

  • It’s flawed, but still the obvious place to monitor a situation.

    • It's long been taken over by Telegram, which among its other advantages (more like a message board than 'town square'), doesn't have hordes of people commenting "@grok explain this to me" under every post.

I'm not a fan of Elon's software endeavors, ever since he bought Twitter and turned it into an even worse cesspool of angry political nonsense than it used to be. I don't like how he's been biasing Grok, etc.

But, what exactly is so bad about Grokipedia? It's a different approach and I think a valid one: trying to do with AI what people have been doing manually at Wikipedia. I'm curious to hear the substantive comparisons.

  • I think the issue is simply this: Wikipedia trends towards unbiased info through use of the crowd. Grok, with a single owner with an ax to grind, trends towards whatever Elon wants. It’s poisoned information under the control of one man - cyberpunk novels have been written about less.

    • A concrete example: a few weeks ago, Musk was making a big deal about how most of his massive net worth was not held in cash, and by a total coincidence the phrase "primarily derived from equity stakes rather than cash" showed up on his Grokipedia page in the section about net worth. I checked the pages of several other extremely wealthy people and none of them had such a comment.

    • > wikipedia trends towards unbiased info through use of the crowd

      See, this is why people even give a project like Grokipedia the time of day. While in theory anyone can edit Wikipedia, in practice the moderators form a much smaller and weirder cabal, and they reject edits that go against their views. The frustration of squaring the naive assertion that Wikipedia distills the wisdom of the crowds with the reality of Wikipedia on any page of note is what provides the psychic permission to even entertain a project with such obvious flaws as Grokipedia.

  • >>I don't like how he's been biasing Grok, etc.

    >>But, what exactly is so bad about Grokipedia

  • It's controlled by a guy who spends all day retweeting white supremacists and lying about his companies. Why should anyone who isn't a white supremacist use it?