Comment by PaulHoule

2 years ago

That's a touch beyond state of the art but we might get there.

If there is one big problem with today's LLMs, it is that the attention window is too short to hold a "complete" document. I can put the headline of an HN submission through BERT and expect BERT to capture it, but there is (as yet) no way to cut a document up into 512-token (BERT) or 4096-token (ChatGPT) slices and then mash those embeddings together into an embedding that can do everything the model is trained to do on a smaller input. I'm sure we will see larger models, but it seems a scalable embedding that grows with the input text would be necessary to move to the next level.
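
To make the objection concrete, here is a minimal sketch of the naive chunk-and-pool workaround, assuming the transformers and torch packages. The mean-pooling step is an illustrative assumption, not a trained capability of the model; it is exactly the lossy "mashing together" being criticized here.

  # Slice a long document into 512-token pieces, embed each with
  # BERT, and mean-pool the results into one document vector.
  import torch
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  model = AutoModel.from_pretrained("bert-base-uncased")

  def embed_long_document(text: str, chunk_size: int = 510) -> torch.Tensor:
      # tokenize the whole document; skip [CLS]/[SEP] for simplicity
      ids = tokenizer(text, add_special_tokens=False)["input_ids"]
      chunk_vectors = []
      for start in range(0, len(ids), chunk_size):
          chunk = torch.tensor([ids[start:start + chunk_size]])
          with torch.no_grad():
              out = model(input_ids=chunk)
          # mean over tokens -> one vector per chunk
          chunk_vectors.append(out.last_hidden_state.mean(dim=1).squeeze(0))
      # mean over chunks -> one vector per document (the lossy step:
      # averaging discards the document-level structure a genuinely
      # long-context model would capture)
      return torch.stack(chunk_vectors).mean(dim=0)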

No, this is the current state of the art: https://supabase.com/blog/chatgpt-supabase-docs

  It's built with Supabase/Postgres, and consists of several key parts:
  
  Parsing the Supabase docs into sections.
  Creating embeddings for each section using OpenAI's embeddings API.
  Storing the embeddings in Postgres using the pgvector extension.
  Getting a user's question.
  Querying the Postgres database for the most relevant documents related to the question.
  Injecting these documents as context for GPT-3 to reference in its answer.
  Streaming the results back to the user in realtime.
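
A rough Python sketch of those quoted steps, assuming the openai client and psycopg2; the table name, column names, sample question, and the five-row limit are illustrative, not the actual Supabase schema.

  import psycopg2
  from openai import OpenAI

  client = OpenAI()

  def embed(text: str) -> str:
      # one 1536-dim ada-002 embedding, serialized in pgvector's
      # '[x,y,...]' input format
      resp = client.embeddings.create(model="text-embedding-ada-002",
                                      input=text)
      return "[" + ",".join(str(x) for x in resp.data[0].embedding) + "]"

  conn = psycopg2.connect("dbname=docs")
  with conn, conn.cursor() as cur:
      cur.execute("create extension if not exists vector")
      cur.execute("""create table if not exists sections (
                         id serial primary key,
                         content text,
                         embedding vector(1536))""")
      # retrieve the sections nearest the question by cosine distance
      question = "How do I enable row level security?"  # illustrative
      cur.execute(
          "select content from sections "
          "order by embedding <=> %s::vector limit 5",
          (embed(question),),
      )
      # these rows become the context injected into the GPT-3 prompt
      context = "\n---\n".join(row[0] for row in cur.fetchall())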

The same thing could be done with search engine results, and from recent demos it looks like this is the kind of analytic augmentation that Microsoft and OpenAI have added to Bing.