Comment by awei
2 days ago
I see how useful a universal UI language that works across platforms is, but when I look at some examples from this protocol, I have the feeling it will eventually converge to what we already have: HTML. Instead of making all platforms support this new universal markup language, why not make them support HTML, which some already do, and which LLMs are already trained on?
Some examples from the documentation:

  {
    "id": "settings-tabs",
    "component": {
      "Tabs": {
        "tabItems": [
          {"title": {"literalString": "General"}, "child": "general-settings"},
          {"title": {"literalString": "Privacy"}, "child": "privacy-settings"},
          {"title": {"literalString": "Advanced"}, "child": "advanced-settings"}
        ]
      }
    }
  }

  {
    "id": "email-input",
    "component": {
      "TextField": {
        "label": {"literalString": "Email Address"},
        "text": {"path": "/user/email"},
        "textFieldType": "shortText"
      }
    }
  }
A key challenge with HTML is client-side trust. How do I enable an agent platform (say Gemini, Claude, OpenAI) to render UI from an untrusted third-party agent that’s integrated with the platform? This is a common scenario in the enterprise version of these apps, e.g. I want to use the agent from (insert SaaS vendor) alongside my company’s homegrown agents and data.
Most HTML is actually HTML+CSS+JS - IMO, accepting this is a code injection attack waiting to happen. By abstracting to JSON, a client can safely render UI without this concern.
If the JSON protocol in question supports arbitrary behaviors and styles, then you still have an injection problem even over JSON. If it doesn't support them you don't need to support those in an HTML protocol either, and you can solve the injection problem the way we already do: sanitizing the HTML to remove all/some (depending on your specific requirements) script tags, event listeners, etc.
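To make the sanitizing option concrete, here is a minimal sketch of an allowlist sanitizer using only Python's standard `html.parser`. The allowlists and the `sanitize` helper are hypothetical names chosen for illustration, and a hand-rolled filter like this is not a substitute for a vetted sanitizer library; it only shows the idea of dropping scripts, event-handler attributes, and unsafe URLs.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists (assumptions); tune these to your own requirements.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "a", "br"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("https://", "http://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}  # remove the tag and everything inside
VOID_TAGS = {"br"}                       # no closing tag is emitted for these


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth += 1
            return
        if self._skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick, style, and any other unlisted attribute
            value = value or ""
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unlisted URL schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self._skip_depth = max(0, self._skip_depth - 1)
            return
        if self._skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._skip_depth:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Unlisted tags are dropped but their text is kept, while `script` and `style` lose their contents as well. Comments and doctype declarations are discarded because the parser's default handlers ignore them.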
Perhaps the protocol is then HTML/CSS/JS in a strict sandbox: the component has no access to anything outside of its own bounds (no network, no DOM/object access, no draw access, etc.).
I think you can do that with an iframe, but it always makes me nervous.
Right, this makes sense. I wonder if it would then be a good idea to abstract HTML to JSON, making it impossible to include CSS and JS in it.
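As a rough illustration of that idea, the sketch below renders a JSON component to HTML on the trusted client side. It assumes the component shape from the `email-input` example quoted earlier in the thread; the `resolve`, `render`, and `RENDERERS` names are hypothetical, and treating `path` as slash-separated keys into a data model is my assumption, not something the protocol documentation is quoted as saying here.

```python
from html import escape


def resolve(value, data):
    """Resolve {"literalString": ...} or {"path": "/a/b"} against a data model.

    Assumption: a path is a list of slash-separated dictionary keys.
    """
    if "literalString" in value:
        return str(value["literalString"])
    node = data
    for key in value["path"].strip("/").split("/"):
        node = node[key]
    return str(node)


def render_text_field(props, data):
    # Every agent-supplied string is escaped before it reaches the markup.
    label = escape(resolve(props["label"], data))
    text = escape(resolve(props["text"], data), quote=True)
    return f'<label>{label} <input type="text" value="{text}"></label>'


# Only component types listed here can ever be rendered (hypothetical registry).
RENDERERS = {"TextField": render_text_field}


def render(component, data):
    (kind, props), = component["component"].items()
    if kind not in RENDERERS:
        raise ValueError(f"unsupported component: {kind}")
    return RENDERERS[kind](props, data)
```

The agent only ever chooses a component type and supplies strings, so there is no place in the payload where CSS or JS could be expressed; anything outside the registry is rejected rather than passed through.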
Curious to learn more about what you are thinking.
One challenge is that you likely do want JS to process/capture the data - for example, taking the data from a form and turning it into JSON to send back to the agent.
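One way to picture this without agent-supplied JS is to let the trusted client do the capture itself, using the data bindings already present in the components. The sketch below is an assumption-heavy illustration: `set_path`, `collect`, and the `field_values` mapping (component id to what the user typed) are hypothetical, and the `path` semantics are assumed to be slash-separated keys.

```python
import json


def set_path(data, path, value):
    """Write value into a nested dict at a slash-separated path (assumed semantics)."""
    keys = path.strip("/").split("/")
    node = data
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def collect(components, field_values):
    """Build the JSON payload to send back to the agent.

    components:   list of component dicts shaped like the quoted TextField example
    field_values: hypothetical mapping of component id -> text the user entered
    """
    data = {}
    for component in components:
        props = component["component"].get("TextField")
        if props and "path" in props.get("text", {}):
            value = field_values.get(component["id"], "")
            set_path(data, props["text"]["path"], value)
    return json.dumps(data)
```

Here the form-to-JSON step lives in the renderer, so the untrusted agent never ships executable code; it only declares where each field's value should land.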
If you play with A2UI's generator, that's effectively what it does, just a layer of abstraction or two above what you're describing.