
Comment by thatguymike

3 days ago

I'm sympathetic, but I do think Realtalk could be improved with some simple object recognition and LLMing.

One of the challenges I found when I played with Realtalk is interoperability. The aim is to use the "spatial layer" to bootstrap people's intuitions on how programs should work and interact with the world. It's really cool when this works. But key intuitions about how things interact when combined only hold if the objects have been programmed to be compatible. A balloon wants to "pop if it comes into contact with anything sharp". A cactus wants to say "I am sharp". But if someone else has programmed a needle card to say "I am pointy", then it won't interact with the balloon in a satisfying way. Or, to use one of Dynamicland's favorite examples: say I have an interactive chart which shows populations of different countries when I place the "Mexico card" into the filter spot. What do you think should happen if I put a card showing the Mexican flag in that same spot, or some other card which just has the string "Mexico" on it? Wouldn't it be better if their interaction "just works"?
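To make the balloon/needle mismatch concrete, here is a toy sketch. It is not Realtalk syntax; `Thing`, `pops`, and `BALLOON_TRIGGER` are made-up names, and the only assumption is that objects publish plain-string claims which a rule matches exactly.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A physical object that publishes plain-string claims (hypothetical model)."""
    name: str
    claims: set[str] = field(default_factory=set)


def pops(balloon_trigger: str, other: Thing) -> bool:
    """The balloon pops only if the other object makes the exact claim it listens for."""
    return balloon_trigger in other.claims


BALLOON_TRIGGER = "sharp"  # the word the balloon's author happened to choose

cactus = Thing("cactus", {"sharp"})
needle = Thing("needle", {"pointy"})  # different author, different word

cactus_pops = pops(BALLOON_TRIGGER, cactus)  # True: the strings match
needle_pops = pops(BALLOON_TRIGGER, needle)  # False: "pointy" != "sharp"
```

Under exact matching, the needle fails only because its author picked a different word, not because of anything about needles.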

Visual LLMs can aid with this. Even a thin layer which can assign tags or answer binary questions about objects could be used to make programs massively more interoperable.
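As a rough sketch of what such a thin tagging layer might look like: the hand-written `ALIASES` table below is only a stand-in for whatever a vision model would return, and every name here is hypothetical.

```python
# Stand-in for a tagger: maps each author's wording onto one shared tag.
# A real system would get these tags from a model; this table is an assumption.
ALIASES = {"pointy": "sharp", "spiky": "sharp", "prickly": "sharp"}


def normalize(claims: set[str]) -> set[str]:
    """Rewrite each claim to its shared tag, leaving unknown claims untouched."""
    return {ALIASES.get(claim, claim) for claim in claims}


def pops(trigger: str, claims: set[str]) -> bool:
    """The balloon pops if any normalized claim equals the trigger it listens for."""
    return trigger in normalize(claims)


needle_pops = pops("sharp", {"pointy"})  # True: "pointy" is mapped to "sharp"
pillow_pops = pops("sharp", {"soft"})    # False: no alias applies
```

With an explicit table, anyone can read why the needle now pops the balloon. Swapping the table for model output keeps the same shape but makes the mapping harder to inspect.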

That's similar to the issue with the whole NFT craze, where you'd "take items from one game to another": it requires everything to work with everything.

For Dynamicland I get the issue, though putting the whole thing through an LLM to make "pointy" and "sharp" both trigger the same effects on another card would just hide the interaction entirely. It might or might not work, for reasons completely opaque to both designer and user.

The way you'd figure this out in Dynamicland is to look at the balloon, which by custom would have its code taped on somewhere. You'd read that code, figure out what it's looking for, and write said trigger.

  • I just realized that the OP said they had already played around with Realtalk, but it's too late to edit: sorry for assuming that the printed code is sufficient! Was term mismatch one of the biggest issues you ran into, and if so, was it that the printed code didn't contain enough information?