
Comment by kccqzy

17 hours ago

There are so many things that I would have done differently.

> We added a keyframes_only flag. We modified the video decoder to check FrameType::Idr. We set GOP to 60 (one keyframe per second at 60fps). We tested.

Why muck around with P-frames and keyframes? Just make your video 1fps.

> Now it’s 10Mbps of blocky garbage that’s still 30 seconds behind.

10 Mbps is way too much. I occasionally watch YouTube videos where someone writes code. I set my quality to 1080p to be comparable with the article, and YouTube serves me the video at way less than 1 Mbps. I did some quick napkin math for a random coding video and it came out to 0.6 Mbps. It's not blocky garbage at all.
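For anyone who wants to repeat that kind of napkin math, here is a minimal sketch. The file size and duration are made-up illustrative numbers (not the ones from the video mentioned above), and `average_bitrate_mbps` is just a hypothetical helper name.

```python
def average_bitrate_mbps(size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate in megabits per second (1 Mbps = 1,000,000 bits/s)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    bits = size_bytes * 8
    return bits / duration_seconds / 1_000_000


# Hypothetical example: a 10-minute video that downloads as 45 MB.
print(f"{average_bitrate_mbps(45_000_000, 600):.2f} Mbps")  # prints "0.60 Mbps"
```

This is an average over the whole video, audio included if it is in the same file, so it only gives a rough idea of what the video stream needs.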

> I occasionally watch YouTube videos

My experience is that at the same bitrate, real-time hardware encoding gives way worse quality than offline CPU encoding (which is what YouTube does when you upload a video), so you can't compare them directly.

10 Mbps is still crazy high, and the target should still be around 1 Mbps.

This blog post smells of LLM, both in the language style and the muddled explanations / bad technical justifications. I wouldn't be surprised if their code is also vibe coded slop.

  • > I wouldn't be surprised if their code is also vibe coded slop.

    That's my takeaway from this too. I think they tried the first thing the LLM suggested, it didn't work, they asked the LLM to fix it, and ended up with this crap. They never tried to really understand the problems they were facing.

    Video is really fiddly. You have all sorts of parameters to fiddle with. If you don't dig into that and figure out what tradeoffs you need to make, you'll easily end up in the position where (checks notes) you think you need 40 Mbps for 1080p video and 10 Mbps is just too shitty.

    There are various points in the article where they talk about having 30 seconds of latency. Whatever is causing it, this is a solved problem. We all have experience with video teleconferencing. This isn't anything new or special; they're just doing it wrong. They say it doesn't work because of corporate network policy, but we all use Teams or Slack.

    I think you're right. They just did a bunch of LLM slop and decided to send it. At no point did they understand any of their problems more deeply than the LLM did.

Setting it to 1 FPS might not be enough. The GOP or P-frame settings need to be adjusted to make every frame a keyframe.

  • Why would you do that?

    Nearly-static content is where you want even fewer keyframes than usual. In a situation like this you need them when the connection is interrupted and you reset things, and not much of anywhere else.
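To put rough numbers on the keyframe tradeoff discussed above, here is a small sketch of a toy model. It assumes each GOP is one keyframe followed by equally sized delta (P) frames. The frame sizes are invented for illustration, and `gop_bitrate_mbps` is a hypothetical helper, not something from the article or any encoder API.

```python
def gop_bitrate_mbps(fps: float, gop_length: int,
                     keyframe_bytes: int, delta_frame_bytes: int) -> float:
    """Average bitrate for a stream with one keyframe per GOP.

    Toy model: every GOP holds 1 keyframe plus (gop_length - 1) delta frames,
    and all frames of the same type are assumed to be the same size.
    """
    if gop_length < 1:
        raise ValueError("gop_length must be at least 1")
    bytes_per_gop = keyframe_bytes + (gop_length - 1) * delta_frame_bytes
    bytes_per_second = bytes_per_gop * fps / gop_length
    return bytes_per_second * 8 / 1_000_000


# Assumed sizes for nearly static screen content (illustrative only):
KEYFRAME_BYTES = 100_000   # a full 1080p frame
DELTA_BYTES = 2_000        # a frame where almost nothing changed

all_keyframes = gop_bitrate_mbps(1, 1, KEYFRAME_BYTES, DELTA_BYTES)
sparse_keyframes = gop_bitrate_mbps(1, 60, KEYFRAME_BYTES, DELTA_BYTES)
print(f"1 fps, every frame a keyframe: {all_keyframes:.3f} Mbps")
print(f"1 fps, one keyframe per 60 frames: {sparse_keyframes:.3f} Mbps")
```

Under these assumed sizes the all-keyframe stream costs many times more bandwidth for the same frame rate, which is the argument for sending keyframes only when the connection has to be reset.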

One man's not-blocky-garbage is another's insufferable hell. Even at 4k I find YouTube quality to be just awful with artefacts everywhere.