Comment by imvg

10 months ago

I agree. Just the use of animations for explanations was a huge step forward. I wonder why the flagship ML/AI conferences have not yet adopted the Distill digital template for papers. I think that would be the first step; the quality would follow.

The quality would not follow, because Distill.pub publications take literally hundreds of man-hours for the Distill part. Highly skilled man-hours too, to make any of this work reliably on the Web. (Source: I once asked Olah basically the same question: "How many man-hours of work by Google-tier web developers could those visualizations possibly be, Michael? 10?")

  • I've been wondering at what point AI assistants will reduce that to a manageable level. It's unfortunately not obvious what the main bottlenecks are, though Chris and Shan might have a good sense.

    • It might be doable soon, you're right. But vision-language models seem to have a substantial weakness: they handle screenshots, tables, schematics, and visualizations much worse than real-world photographs. (This is also, I'd guess, part of why Claude/Gemini do so badly on Pokemon screenshots without a lot of hand-engineering. Abstract pixel art in a structured UI may be a sort of worst-case scenario for whatever it is they do.) So that makes it hard to give them any kind of visual feedback, never mind letting them code interactive visualizations autonomously.
