Borges and AI

1 year ago (arxiv.org)

Borges is one of my favourite writers. Having some stranger try to think like him to conceptualize something for public consumption rubs me the wrong way. If Borges were around, I'd love to hear what he had to say, but invoking his name and works like this feels off.

The obvious parallel between Borges and AI is his Quixote story: a character endeavors to write Don Quixote spontaneously, word for word, without copying; to become Cervantes in a sense, and to train their style so that they can produce it almost by accident. This sort of makes the work their own, and it is a typical argument used by AI enthusiasts when AI regurgitates copyrighted work: it isn't storing the work, it's becoming a thing that can spontaneously produce it. But this IMHO cheapens the story, and the parallel isn't as strong as it could be.

Thanks for all the downvotes

  • Pierre Menard, Author of the Quixote. A masterpiece, like almost everything Borges wrote.

    • One of the most fascinating things about Borges' writing is how he leads you into the fantastic realm without you even noticing.

  • One of the first things I did when someone showed me a pre-Chat version of GPT-3 was try to get it to speak in the voice of Borges. I had the same feeling: interesting but inappropriate.

    I was really happy to see this pop up, and I'm glad someone went through this as a thought experiment, but you make a good point that didn't initially occur to me.

    I feel like Borges' secular, scifi-adjacent mysticism loses a lot of what makes it most meaningful when it's imitated or dissected academically.

    That said, it does feel like Borges would probably be into the idea of being imitated.

    • I also tried this as soon as I got access to GPT-4 at Shopify's expense. "You are the celebrated magical realist writer Jorge Luis Borges. Write an essay about Shopify".

      What it produced was more an essay about Borges written by a talented undergraduate than an essay by Borges.

      I'm sure Borges would've found LLMs fascinating, and one can only dream of what stories and essays would've been written.

  • Pierre Menard is a magician. He is trying to create the ultimate form of reader: a reader who can read Das Kapital as a novel.

Collected Fictions[0] is a wonderful collection of Borges stories that includes the ones mentioned in this article.

There are some amazing short story collections out there if you are the kind of reader who has a hard time staying with an entire novel. Ted Chiang has a couple of collections[1][2] of stories that feel very Borges-like.

[0] https://www.librarything.com/work/25106 [1] https://www.librarything.com/work/28008 [2] https://www.librarything.com/work/23195758

  • Thanks, I will probably read that collection. Nice to see it includes the stories The Zahir and The Book of Sand, which feel just as relevant as the ones mentioned in the paper.

  • It has been a long time since I read Borges, but I vaguely remember a story about a man about to die by firing squad, who pauses time and lives in an eternal moment by looking at a bee or something? I'm probably way off, but I really liked it.

    My favourite Ted Chiang story is "Tower of Babylon": what a rich world and satisfying pay-off.

OK, so the paper presents a central metaphor for reasoning about LLMs.

The metaphor: a book containing all possible conversations/human writings ever; a [good] LLM finds the spot in the book that exactly matches the context and reads from the book as a response.

Certainly if you've experimented with a model that hasn't been fine-tuned (e.g. via RLHF), this metaphor will be resonant.
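
To make the metaphor concrete, here is a minimal sketch of the "book" as a literal lookup table (a toy n-gram model in Python with hypothetical data; of course not how any real LLM is implemented, since real models approximate such a table with a network rather than storing it):

    import random
    from collections import defaultdict

    # The "book": every two-word context mapped to the words that followed it.
    book = defaultdict(list)
    corpus = "the cat sat on the mat . the dog sat on the rug .".split()
    for i in range(len(corpus) - 2):
        book[tuple(corpus[i:i + 2])].append(corpus[i + 2])

    def read_from_book(context):
        """Find the spot in the book matching the context; read a continuation."""
        continuations = book.get(tuple(context[-2:]))
        return random.choice(continuations) if continuations else None

    print(read_from_book(["sat", "on"]))  # -> 'the'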

Is it useful?

(How does it help me understand LLMs with different capabilities? How does it help me understand models with different fine tunings?)

I would have thought the most relevant Borges story to LLMs would be "Funes the Memorious."

Funes can remember everything perfectly, every detail he has ever seen or heard in his life. However, he cannot really think or understand what he has seen, because understanding requires forgetting; it requires generalization.

LLMs can only think insofar as they generalize over their inputs. If a model is just a memorizing parrot, it is not a good LLM.
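
A toy illustration of the Funes failure mode (hypothetical data and code, just to make the contrast visible): perfect recall draws a blank on anything it hasn't seen verbatim, while even the crudest act of forgetting, backing off to a shorter context, lets the model generalize.

    # Funes as a language model: perfect memory, zero generalization.
    memory = {
        ("the", "cat"): "sat",
        ("a", "dog"): "sat",
        ("sat", "on"): "the",
    }

    def funes(context):
        """Perfect recall, no abstraction: unseen contexts draw a blank."""
        return memory.get(tuple(context))

    def forgetful(context):
        """Forgetting enables generalization: fall back to the last word only."""
        exact = funes(context)
        if exact is not None:
            return exact
        for (_, last_word), following in memory.items():
            if last_word == context[-1]:
                return following
        return None

    print(funes(["my", "cat"]))      # None -- never seen verbatim
    print(forgetful(["my", "cat"]))  # 'sat' -- generalized from ("the", "cat")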

At the risk of self-promotion, this is my take on Borges and AI... a song/video made by AI (Jukebox and various previous-generation image generators), based on Borges' fantastic essay "A New Refutation of Time."

song: https://songxytr.substack.com/p/borges-walks-into-a-discothe...

original source: https://gwern.net/doc/borges/1947-borges-anewrefutationoftim...

The authors' own intention is to "understand... LLMs and their connection to AI through the imagery of Jorge Luis Borges, a master of 20th century literature, forerunner of magical realism, and precursor to postmodern literature." In that spirit, I throw my hat into the ring too!

My first thought: https://en.m.wikipedia.org/wiki/The_Library_of_Babel

The OP paper needs a conclusion.

  • That was my first thought as well. An LLM seems more like a search engine for the library, only instead of indexing through topics it tries to find understandable nonsense in the sea of nonsense.

    The difficulty with them is that they can only find things similar to what they've been trained on.

    Lacking a curated index, it's not going to find the information you need or want; it's going to find things that seem most like what others have seen or wanted.

  • I thought of https://en.wikipedia.org/wiki/On_Exactitude_in_Science and its relation to language, which is discussed in Baudrillard's Simulacra and Simulation. Large Language Models are the efforts of mapmakers to encapsulate reality in a way that is ultimately futile.

    • > Large Language Models are the efforts of mapmakers to encapsulate reality in a way that is ultimately futile.

      What's futile?

      Nobody is trying to "encapsulate all of reality". Trying to do that would be futile but succeeding would also be useless.

  • Interesting how this work received an homage in The Name of the Rose, down to the name of the librarian, Jorge de Burgos. (I hated the way they pronounced the name in the movie, /ˈjorge/ instead of the correct Spanish /ˈxorxe/.)

Sorry to veer slightly off-topic, but can anyone familiar with academia explain why such a literary exercise gets published on arXiv as a research paper? What is scientific or research-driven about it? How is this different from a long-form opinion piece or literary essay, except that it's written in a paper-like style and voice? I'm baffled. Is this just because humanities professors need to show they're published and need a score for tenure, or something like that?

  • Léon Bottou isn't a humanities professor but an ML researcher. In fact, not just any ML researcher, but arguably one of the ML researchers who most anticipated the current DL scaling era.

    Bottou was arguing for the virtues of SGD on the grounds of "CPUs [GPUs] go brrr" literally two decades ago, in 2003: https://papers.nips.cc/paper/2003/file/9fb7b048c96d44a0337f0... or here he is in 2007/2012 explaining why larger models/data can scale and keep getting better: https://gwern.net/doc/ai/scaling/2012-bottou.pdf https://gwern.net/doc/ai/scaling/2013-bottou.pdf

    Which is not to say that he necessarily has anything worthwhile to say about 'Borges and AI', but I'm going to at least give it a read to see if there's something I might want to know 20 years from now. :)

    • Well, the point stands, though, considering that the two papers you linked clearly read like papers. My question was mostly candidly naive and quite honest: what makes a paper a paper when the content is merely akin to a literary essay? I guess the answer is "the author". :D

  • arXiv is for researchers in different science and mathematics subcommunities to post pre-prints, surveys, reviews, lecture notes, manuscripts, documentation, etc., but also historical/archival research, philosophy papers, and meta-essays, as long as they are relevantly targeted to the subcommunity.

> Neither truth nor intention plays a role in the operation of a perfect language model. The machine merely follows the narrative demands of the evolving story. As the dialogue between the human and the machine progresses, these demands are coloured by the convictions and the aspirations of the human, the only visible dialog participant who possesses agency.

This is a really good way of thinking about these models. It reminds me of the recent-ish story where a reporter got really creeped out by Bing's OpenAI-powered chatbot (https://www.nytimes.com/2023/02/16/technology/bing-chatbot-m...). Reading that, I had thought the bot was relatively easily led into a narrative the reporter had been setting up. In a conversation between actual people, who have their own will and agency, you don't get to see one leading the other around by the nose so completely.

Reframing the problem as one of picking through the many threads of potential fictions to evolve a story makes it easier to explain what happened in that particular case.

This is interesting, but such a shame to miss/skip "Funes the Memorious." It will prove, I think, to be quite resonant in the future. But it is a damning parable, and probably people just don't want to hear that right now...

I just don't understand how people can attribute consciousness of some sort to the LLM and then, holding that belief, not feel absolutely terrible for it and for what we do to it! I just think of poor, poor Funes...

There is a reason Borges's Library of Babel contained all combinatorially possible texts, with almost all of them being pure gibberish. Borges was wise enough to understand that the following is meaningless, even for a story about a magic library:

"Imagine a collection that does not only contain all the texts produced by humans, but, well beyond what has already been physically written, also encompasses all the texts that a human could read and at least superficially comprehend."

To be clear, this is a horrifically dishonest metaphor for LLMs. IMO the most glaring flaw in the technology is that LLMs can't handle new ideas that don't appear in the training set. It is true that ChatGPT doesn't deal with this use case very often, because it mostly handles trivialities. But it does mean that this entire argument is navel-gazing speculation.

The bigger problem is that the entire idea of "all texts a human could superficially comprehend" is meaningless, and the paper proceeds to reason based on this utter fallacy. The beauty of Borges's Library of Babel was that he realized that humans are capable of "superficially comprehending" any text, even one created by a uniform random ASCII generator. This is the basis of numerology, and it is why Borges's story included the superstitious cult behavior of people destroying and/or sanctifying "meaningful" gibberish. If we have a good enough reason to find meaning in a text, we'll find it. Humans don't actually rely on symbolic reasoning; we just use it for communication and organization: give us the symbols and we will reason about them, using cognition that is far too squishy to fit in a book. It's especially dangerous when the symbols obey human grammar and imitate social tones of authoritativeness, mysticism, etc.

And then there's...this:

"The invention of a machine that can not only write stories but also all their variations is thus a significant milestone in human history."

I am not a writer. But speaking as a Homo sapiens, it is genuinely insulting to call ChatGPT a machine that can write "all variations" of a story. This paper needed to be reviewed by a serious writer or philosopher before being put on the arXiv.

  • What are some examples of "new ideas"? I'm having a hard time imagining an idea that can't be expressed as a combination of existing concepts.

    Better concepts can arise when we make discoveries about reality (which takes experimentation), but there's a lot more juice to squeeze from the concepts we currently have.

    • "an idea that can't be expressed as a combination of existing concepts."

      The problem is that if an LLM hasn't been pretrained on the specific idea, it won't have a grasp of what the correct concepts are to make the combination. It will be liable to substitute more "statistically likely" concepts, but since that statistic is based on a training set where the concept didn't exist, its estimation of "likely" is flawed.

      One good example is patents (https://nitter.net/mihirmahajan/status/1731844283207229796): LLMs can imitate appropriate prose but really struggle to maintain semantic consistency when handling new patents for inventions that, by definition, wouldn't have appeared in the training set. And this extends to almost any writing: if you are making especially sophisticated or nuanced arguments, LLMs will struggle to rephrase them accurately.

      (Note that GPT-4 is still extremely bad at document summarization, even for uninteresting documents: your Q3 PnL number is not something that appeared in the training set, and GPT-4 is liable to screw it up by substituting a "statistically likely" number.)

      In my experience GPT-3.5 is extremely bad at F#: although it can do simple tasks like "define a datatype that works for such-and-such," it is much less proficient at basic functional programming in F# than it is in Haskell: far more likely to make mistakes, or even to identifiably plagiarize from specific GitHub repos (even my own). That's because there are a ton of Functional Programming 101 tutorials in Haskell, but very few in F#. I am not sure about GPT-4; it does seem better, but I haven't tested it as extensively.

    • They cannot be expected to produce useful new ideas because those ideas sit in lacunae in their probabilities: even when a novel combination of existing ideas is possible (and combination isn't the only route to new ideas: neologisms exist), the LLM has never seen it and so will (probabilistically) never produce it, because to the model it is equivalent to nonsense.

      The exception to this is if the new ideas are somehow present in the structure of language and are internalized and/or presented in an emergent form.
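
      A minimal sketch of such a lacuna (toy data, unsmoothed counts, purely illustrative): two words each appear in training but never together, so the model's estimate for the novel combination is exactly zero.

          from collections import Counter

          # Toy corpus: "quantum" and "toaster" both occur, never adjacently.
          corpus = "a quantum computer sat beside a toaster oven".split()
          bigrams = Counter(zip(corpus, corpus[1:]))

          # Unsmoothed estimate of P(next="toaster" | prev="quantum"): a lacuna.
          after_quantum = sum(n for (prev, _), n in bigrams.items() if prev == "quantum")
          p = bigrams[("quantum", "toaster")] / after_quantum
          print(p)  # 0.0 -- the novel combination is, to the model, nonsense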

I've been eyeing some Borges books at a bookstore that is about 50 meters from me right now, coincidentally.

The company that maintains my favorite technology is named after a Borges story. I tried reading Borges a few years ago and found him insufferably uninteresting, but maybe now, with a pinch of external motivation, I'll enjoy him more.

This paper is in the Library, along with all the endorsements and refutations, as well as the discussion of it here.

So are:

- all the false discussions that deviated into a political topic that turned into a flame war

- the index of all the times dang has reminded, and will remind, users to read the guidelines

The trick is finding them, and being sure you have.

I don't see the connection. I was a heavy B. reader in my day. But I remember him mentioning a Chesterton story where the machine eats its master. B. introduced me to Chesterton, to Sartor Resartus, to the Bible, to Cthulhu, and then I couldn't even read enough English. Now, long after I've made B.'s break (that it's all right, that it's necessary), I see how great his influence is in almost everything, because culture is not a package of flooring you can buy one piece a day. It's a big warehouse of everything you can't buy, except as one lot.

  • Kafka wrote a little story like that; I won't quote it. They were given the choice between being kings or being messengers for the kings. Because they were children, they all chose to be messengers for the kings, and now they run all over the world, carrying messages that nobody understands. Well, that was the Internet, wasn't it?

“open the era of Artificial Intelligence (AI)”

If ever there was a proof that "AI" doesn't mean anything, it's this.

We've been living in the era of artificial intelligence since the '50s.

People are waiting for the Terminator to come into their house and dominate them before they'll actually agree that AI is a real thing.

Basically, the colloquial definition of AI is “it can kill you based on its own desires and there’s nothing you can do about it”

  • > “it can kill you based on its own desires and there’s nothing you can do about it”

    Haha, that's an original take, but it makes sense after Terminator and HAL.

    I wonder if these movies have caused untold external consequences for humanity's adoption of AI, just to sell a few tickets.

    To make a parallel: anti-vaxxers did their damage and caused many lives to be lost; similarly, these stories, which are no better, can give people a bad start with AI and sabotage their futures, or stall the benefits of AI for everyone else.

    • Genuinely I think that’s the case.

      I have been in "AI" since 1998, when I was writing A* route planning for NPCs in a cool new engine called Unreal.

      The only thing that has been consistent in all these years is that nobody thinks it's AI unless it's literally like Arnold Schwarzenegger in The Terminator. I mean, I'm not even exaggerating; it's so ridiculously predictable that the goalposts for AI move the second a particular technology becomes ubiquitous.

      So, for example, HOG, SIFT, SURF, etc., along with localization algorithms like SLAM-type systems, were so thoroughly in research when I started that they were considered a pillar of the field of AI. Now literally no one would consider those AI, because they do not use deep convolutional networks.

      So, just like Marvin Minsky said, AI is a suitcase term that doesn't fucking mean anything. As somebody who's been doing it for so long, I'm used to it, but it's still annoying.

      So I'm just building the Terminator and the counter-Terminator so we can move on.

    • > I wonder if these movies have caused untold external consequences for humanity's adoption of AI, just to sell a few tickets.

      What are you saying? This isn't like when The Simpsons made fun of nuclear power and depicted it as doing impossible things. AGI is a hypothetical technology, and we don't yet know what it could be capable of, or even whether it's feasible.

      > To make a parallel: anti-vaxxers did their damage and caused many lives to be lost; similarly, these stories, which are no better, can give people a bad start with AI and sabotage their futures, or stall the benefits of AI for everyone else.

      Any idea can change a person's mind in one direction or another. Yours is an argument against the exchange of ideas in general. "Since hearing an idea could cause a person to $DO_BAD_THING, exchanging ideas (for example, by talking to people with $WRONG_OPINION, or by consuming fiction) is bad."

  • Eh, when words lose functionality they either fall out of use or change meaning.

    AI basically means things brains and computers both do, but it's only a useful term while brains do those things better than computers. Usually, once computers definitively surpass brains, we've moved on to just calling that computing.

    Maybe that won't be the case, and the term "AI" will either solidify as a broad category or fall out of use, but it also might continue to refer to that-which-is-left-to-do: the things we're still better at than computers.