Comment by johnfn

2 months ago

I wrote a comment saying that this should be possible with a proper playwright harness and screenshot taking. My comment ended up in the negatives (though curiously no one stopped to explain why), as if I was saying something so absurdly inaccurate that it wasn’t even worth rebutting. Thank you for actually running the experiment and proving it - I was almost annoyed enough to do it myself.

I couldn’t understand why it had happened - it felt about as logical to my mind as writing a comment that Rust was faster than Node. I feel there is a strong anti-AI sentiment here, to the point that people will ignore evidence presented directly to them.

Personal vendetta aside, I enjoyed this post! You had some clever tricks I wouldn’t have considered. In fact, the idea of producing a pixel diff as output was particularly imaginative. And the bit about autoformalization definitely hits on something I’ve been feeling when working with AI recently.

EDIT: I notice my comment yesterday is in the positives. Please don’t vote it up. That was not my intention here.

53 comments

simlevesque 2 months ago

There's a lot of LLM haters, simple as that.

DANmode 2 months ago

For posterity: Claude is no longer “just” an LLM you’re interacting with.
ls-a 2 months ago
I use AI every day. But some AI adopters are getting a bit culty as well.
- johnfn 2 months ago
  
  But there is nothing culty about saying “an LLM could one-shot this” when it has clearly been demonstrated that an LLM can, in fact, one-shot this!
  

theahura 2 months ago

Note that I didn't even tell it to use a pixel diff. Claude w/ Nori did that on its own by following the Nori TDD skill. I did very little, I'm actually very lazy :D
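
For readers unfamiliar with the term, a pixel diff compares a reference screenshot with a screenshot of the recreation and reports which pixels disagree. The sketch below is only an illustration of that idea, not what Claude or the Nori TDD skill actually produced: the function name `pixel_diff`, the `tolerance` parameter, and the representation of images as nested lists of RGB tuples are all assumptions made to keep the example dependency-free.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each channel 0-255
Image = List[List[Pixel]]      # rows of pixels; hypothetical in-memory format


def pixel_diff(expected: Image, actual: Image, tolerance: int = 0):
    """Compare two same-sized images pixel by pixel.

    Returns (mismatch_ratio, mask), where mask[y][x] is True when any
    channel of that pixel differs by more than `tolerance`.
    """
    same_height = len(expected) == len(actual)
    same_widths = all(len(a) == len(b) for a, b in zip(expected, actual))
    if not (same_height and same_widths):
        raise ValueError("images must have identical dimensions")

    mask = []
    mismatched = 0
    total = 0
    for row_expected, row_actual in zip(expected, actual):
        mask_row = []
        for p_expected, p_actual in zip(row_expected, row_actual):
            differs = any(
                abs(c_e - c_a) > tolerance
                for c_e, c_a in zip(p_expected, p_actual)
            )
            mask_row.append(differs)
            mismatched += differs  # True counts as 1
            total += 1
        mask.append(mask_row)

    ratio = mismatched / total if total else 0.0
    return ratio, mask
```

In a test harness, the ratio could serve as the pass/fail signal (for example, failing when it exceeds some chosen threshold), while the mask shows where the layout is off.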

stanac 2 months ago
There is a quote about lazy developers, but I'm too lazy to search for it.
- eCa 2 months ago
  
  Laziness is one of the three virtues (of a good programmer), but I think Larry didn’t anticipate the current situation when he wrote it:
  “The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it.”

fluidcruft 2 months ago

I didn't get downvoted yesterday but I got pretty far strongly hinting Claude should use very basic image processing approaches and it went for opencv very successfully. It was very fast on the image layout but failed pretty hard on the footer. This morning I decided to walk it through basic image processing for text detection and word building and that went pretty well but I didn't tell it what we were doing and it was too much me telling it what to do. It did sort of realize what we were doing at one point. I was thinking about trying again with just a nudge to think about using basic OCR image processing techniques to detect words and lines and see what Claude comes up with. Was also wondering what it would do if I just told it to use tesseract or paddleocr.
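
The comment above does not say which technique was used for "word building", so the following is only one plausible reading: once character-sized bounding boxes have been detected on a single line of text, neighbouring boxes separated by a small horizontal gap are merged into word boxes. The function name `build_words`, the `(x0, y0, x1, y1)` box format, and the `max_gap` threshold are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1); assumed format


def build_words(char_boxes: List[Box], max_gap: int = 3) -> List[Box]:
    """Merge character boxes on one text line into word boxes.

    Boxes are visited left to right; a box joins the current word when the
    horizontal gap to that word's right edge is at most `max_gap` pixels.
    Assumes all boxes already belong to the same line of text.
    """
    words: List[Box] = []
    for x0, y0, x1, y1 in sorted(char_boxes):
        if words and x0 - words[-1][2] <= max_gap:
            wx0, wy0, wx1, wy1 = words[-1]
            # Grow the current word box to cover the new character.
            words[-1] = (wx0, min(wy0, y0), max(wx1, x1), max(wy1, y1))
        else:
            # Gap too large (or first box): start a new word.
            words.append((x0, y0, x1, y1))
    return words
```

A real pipeline would also need a step that finds the character boxes and groups them into lines first; that part is left out here.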

jack_h 2 months ago

Voting is meant to allow a community to police itself to some extent, although the downside to that is it incentivizes controlling the discussion over contributing to it. It’s a lot easier to vote in accordance with your own beliefs than articulate counter-arguments. The prisoner’s dilemma takes hold, people stop visiting as they get frustrated by the downvoting, and a bubble forms. It’ll be interesting to see how AI changes the online discussion landscape.

godelski 2 months ago
Also note that votes don't just mean agree/disagree. I frequently upvote comments I disagree with and downvote comments I agree with. The votes place the comments in the discussion ranking so plenty of people vote this way.
One example of this is I might like a conversation that's responding to a comment I don't like. But it's a common misunderstanding so I want the conversation to be boosted. Therefore I upvote the comment I disagree with because it's the parent of the comments I want to be more visible.
I don't think I'm the only one that does this on HN. And I think doing this can help reduce the repeated comments. In the above example an early common misconception might get downvoted, not seen by others, and then repeated by some, where it can then rise because it's seen by a different subset.
Anyways, I don't think people should vote strictly on agree/disagree
- jack_h 2 months ago
  
  I also don’t think people should vote like that; the problem is there’s no way to enforce such a pattern[1]. Everyone benefits from voting as you describe, but that’s precisely where the prisoner’s dilemma comes into play. There’s also a size component to it. Small, tight-knit communities tend to do well with voting, but as communities grow interactions become less personal, trust drops, and the incentive structure I described above becomes dominant. Voting essentially allows a community to establish its own Overton window distinct from what the official rules create, but that can be changed and constricted until a bubble is established[2]. I’ve seen it happen with countless communities across social media. Despite good intentions I think voting systems are a net negative to fostering good discussions and debate.
  [1] Maybe AI meta-moderation?
  [2] I don’t mean that this is happening intentionally by bad actors, merely that on average large groups produce outcomes that are dictated by incentives.

gaigalas 2 months ago

Dude, the recreation is a joke (hopefully an intentional one). It uses the screenshot instead of the assets.

Go ahead, turn on the Web Inspector, and remove the body background:

https://tilework-tech.github.io/space-jam/screenshot.png

Tiberium 2 months ago
The article mentions this:
> So it kind of cheated, though it clearly felt angst about it. After trying a few ways to get the stars to line up perfectly, it just gave up and copied the screenshot in as the background image, then overlaid the rest of the HTML elements on top.
- Palmik 2 months ago
  
  That does not make the title any less clickbaity. Moreover, it does not seem like a vindication of johnfn's original comment.
  
- gaigalas 2 months ago
  
  The outcome does not justify @johnfn's redemption celebration. That's why I decided to give him a heads up.
  Aside from that, I think it's a joke. Like the value of pi example I gave in the other comment. If it's not, it is really just sad.
theahura 2 months ago
Please read the blog post!
- gaigalas 2 months ago
  
  It's a joke, right? A joke similar to this one:
  ---
  > Make me a python script that calculates the value of PI
  ```python
  print("3.1415")
  ```
  "I think it's passable!" <--- The joke
  ---
  If it's not a joke, then it's just sad.
  

jayd16 2 months ago

I haven't seen your original comment but "It could work if they did it better" is in general a low value comment.

johnfn 2 months ago
You should go read it and see if you can tell me a way I could improve it. I felt I gave actionable advice, but I’m always happy to know if I could have said things better.
- jayd16 2 months ago
  
  Looking at the comment, I would argue that it's fairly vague. Maybe it's a "clear if you have done it but not clear to others" type of thing.
  Then you undercut the advice by adding "I've always wondered if <confident suggestion> would work", making it unclear how much of the advice is a shot in the dark and how much you've actually seen results from that advice.
  Claims like "you might even one shot it" also make it seem like simple hype and not the war story of someone who's actually taken the advice.
  But you know, people are downvoting me for engaging with your question as well so I don't know. Maybe it's all bots these days :p
  
- kcatskcolbdi 2 months ago
  
  You could improve it by simply doing the thing you describe and linking to it.

Aldipower 2 months ago

It is a task that could be _easily_ done manually in much shorter time without AI, probably by developers who even love to develop. The reaction to this shouldn't be misjudged as anti-AI. A lot of people, including me, just do not get it! For scientific purposes? Ok, fair enough. But what is the further meaning of this exercise?

johnfn 2 months ago
The point is that if we agree that this task is truly a one shot, as long as you agree it’s faster to prompt than code, then while you “easily” do this task in around an hour (or however long you say it will take you), I’ll prompt Claude in around 5 minutes, and get a few more things done while I let it run in the background. What am I missing from your argument?
- Aldipower 2 months ago
  
  Reading the blog post, prompting Claude, setting up Playwright, etc. takes at least one hour, maybe more? Not seeing where your 5 minutes are coming from.
  
fluidcruft 2 months ago

For me it's more that I'm not a web developer and it would definitely take me way longer to research all the parts of doing this. I have booksmarts (at best) about basic CSS and have given up trying to keep up with javascript anything.
BearOso 2 months ago

It's seemingly an experiment to see how an LLM performs when the task is just outside of its milieu. The answer is not very well.