Comment by qingcharles

1 year ago

Multi-modal was the absolute game-changer.

Just last night I was digging around in my basement, pulling apart my furnace, showing pics of the inside of it, having GPT explain how it works and what I needed to do to fix it.

I would never trust an LLM to do this unless it was pointing me to pages/sections in a real manual or reputable source I could reference.

  • I admire your optimism that good manuals and reputable sources exist for the average furnace in the average basement.

    • If there are no reputable sources to point to, then where exactly is GPT deriving its answer from? And how can we be assured GPT is correct about the furnace in question?

    • I would accept a link to a YouTube video with a timestamp. Just something connected to the real world.

Oh right, yeah, I've done things like this (phone calls to ChatGPT, or the openwebui Whisper -> LLM -> TTS setup). I thought there might be something more than this by now.