
Comment by alankay

10 years ago

What is "data" without an interpreter (and when we send "data" somewhere, how can we send it so its meaning is preserved?)

Data without an interpreter is certainly subject to (multiple) interpretation :) For instance, the implications of your sentence weren't clear to me, in spite of it being in English (evidently, not indicated otherwise). Some metadata indicated to me that you said it (should I trust that?), and when. But these seem to be questions of quality of representation/conveyance/provenance (agreed, important) rather than critiques of data as an idea. Yes, there is a notion of sufficiency ('42' isn't data).

Data is an old and fundamental idea. Machine interpretation of un- or under-structured data is fueling a ton of utility for society. None of the inputs to our sensory systems are accompanied by explanations of their meaning. Data, something given, seems to be the raw material of pretty much everything else interesting; interpreters are secondary and, perhaps essentially, varied.

  • There are lots of "old and fundamental" ideas that are not good anymore, if they ever were.

    The point here is that you were able to find the interpreter of the sentence and ask a question, but the two were still separated. For important negotiations we don't send telegrams, we send ambassadors.

    This is what objects are all about, and it continues to be amazing to me that the real necessities and practical necessities are still not at all understood. Bundling an interpreter for messages doesn't prevent the message from being submitted for other possible interpretations, but there simply has to be a process that can extract signal from noise.

    This is particularly germane to your last paragraph. Please think especially hard about what you are taking for granted in your last sentence.

    • Without the 'idea' of data we couldn't even have a conversation about what interpreters interpret. How could it be a "really bad" idea? Data needn't be accompanied by an interpreter. I'm not saying that interpreters are unimportant/uninteresting, but they are separate. Nor have I said or implied that data is inherently meaningful.

      Take a stream of data from a seismometer. The seismometer might just record a stream of numbers. It might put them on a disk. Completely separate from that, some person or process, given the numbers and the provenance alone (these numbers are from a seismometer), might declare "there is an earthquake coming". But no object sent an "earthquake coming" "message". The seismometer doesn't "know" an earthquake is coming (nor does the earth, the source of the 'messages' it records), so it can't send a "message" incorporating that "meaning". There is no negotiation or direct connection between the source and the interpretation.

      We will soon be drowning in a world of IoT sensors sending context-or-provenance-tagged but otherwise semantic-free data (necessarily, due to constraints, without accompanying interpreters) whose implications will only be determined by downstream statistical processing, aggregation etc, not semantic-rich messaging.

      If you meant to convey "data alone makes for weak messages/ambassadors", well ok. But richer messages will just bottom out at more data (context metadata, semantic tagging, all more data). Ditto, as someone else said, for any accompanying interpreter (e.g. bytecode? - more data needing interpretation/execution). Data remains a perfectly useful and more fundamental idea than "message". In any case, I thought we were talking about data, not objects. I don't think there is a conflict between these ideas.
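The seismometer scenario above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Reading` record, the source name `seismometer-7`, the sample values, and the z-score threshold are all hypothetical and come from no real sensor or library. The recording side attaches provenance and nothing else; the interpretation is a separate function that could be swapped for any other.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Reading:
    source: str       # provenance tag (hypothetical name)
    timestamp: float  # sample index, standing in for a real clock
    value: float      # raw amplitude; carries no meaning by itself


def record(source, samples):
    """Sensor side: tag raw numbers with provenance and nothing more."""
    return [Reading(source, float(t), float(v)) for t, v in enumerate(samples)]


def flag_anomalies(readings, threshold=2.5):
    """One possible downstream interpreter: values far from the mean."""
    values = [r.value for r in readings]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [r for r in readings if abs(r.value - mu) / sigma > threshold]


# Made-up samples: a quiet trace followed by one large spike.
stream = record(
    "seismometer-7",
    [0.1, 0.2, 0.1, 0.15, 0.1, 0.2, 0.1, 0.12, 0.18, 0.1, 0.14, 9.0],
)
anomalies = flag_anomalies(stream)
```

Nothing in `record` knows about `flag_anomalies`; a different consumer could apply a different rule to the same stored readings.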


    • Isn't the interpreter code itself data in the sense that it has no meaning without something (a machine) to run it? How do you avoid having to send an interpreter for the interpreter and so on?


    • I think the object is a very powerful idea for wrapping "local" context. But in a network (communication) environment, it is still challenging to handle "remote" context with objects. That is why we have APIs and serialization/deserialization overhead.

      In the ideal homogeneous world of Smalltalk, it is less of an issue. But if you want a Windows machine to talk to a Unix machine, the remote context becomes an issue.

      In principle we can send a Windows VM along with the message from Windows and a Unix VM (docker?) with a message from Unix, if that is a solution.


    • >Please think especially hard about what you are taking for granted in your last sentence.

      Any Meaning can only be the Interpretation of a Model/Signal?

  • Information in the "entropy" sense is objective and meaningless. Meaning only exists within a context. If we think of "data" as representing information, "interpreters" bring us context and therefore meaning.

    • Thank you - I was beginning to wonder if anyone in this conversation understood this. It is really the key to meaningfully (!!) move forward in this stuff.
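The entropy point above can be made concrete with a small sketch. It assumes a simple per-character frequency model, which is only one of many ways to estimate entropy, and the two example sentences are made up for illustration. Both use exactly the same characters, so this measure cannot tell them apart even though a reader would.

```python
from collections import Counter
from math import isclose, log2


def shannon_entropy(message: str) -> float:
    """Shannon entropy in bits per character, from character frequencies."""
    counts = Counter(message)
    total = len(message)
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Same multiset of characters, different meaning for a human reader.
first = "dog bites man"
second = "man bites dog"

h_first = shannon_entropy(first)
h_second = shannon_entropy(second)
same_information = isclose(h_first, h_second)
```

Here `same_information` is true because the measure depends only on symbol frequencies; whatever distinguishes the two sentences has to come from the context an interpreter brings.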

The more meaning you pack into a message, the harder the message is to unpack.

So there's this inherent tradeoff between "easy to process" and "expressive" -- and I imagine deciding which side you want to lean toward depends on the context.

Check this out for a practical example: https://www.practicingruby.com/articles/information-anatomy

(not a Ruby article, but instead about essential structure of messages, loosely inspired by ideas in Gödel, Escher, Bach)

So the idea is to always send the interpreter, along with the data? They should always travel together?

Interesting. But, practically, the interpreter would need to be written in such a way that it works on all target systems. The world isn't set up for that, although it should be.

Hm, I now realize your point about HTML being idiotic. It should be a description, along with instructions for parsing and displaying it (?)

  • TCP/IP is "written in such a way that it works on all target systems". This worked partly because it was early, partly because it is small and simple, partly because it doesn't try to define structures on the actual messages, but only minimal ones on the "envelopes". And partly because of the "/" which does not force a single theory.

    This -- and the Parc PUP "internet" which preceded it and influenced it -- are examples of trying to organize things so that modules can interact universally with minimal assumptions on both sides.

    The next step -- of organizing a minimal basis for inter-meanings -- not just internetworking -- was being thought about heavily in the 70s while the communications systems ideas were being worked on, but was quite to the side, and not mature enough to be made part of the apparatus when "Flag Day" happened in 1983.

    What is the minimal "stuff" that could be part of the "TCP/IP" apparatus that could allow "meanings" to be sent, not just bits -- and what assumptions need to be made on the receiving end to guarantee the safety of a transmitted meaning?
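The "envelope" idea from the TCP/IP comment above can be sketched as follows. This is a toy format, not the real IP or TCP header layout: the 10-byte header, the 4-byte addresses, the `routes` table, and the link names are all assumptions made up for illustration. The point is only that forwarding reads the envelope and never looks inside the payload.

```python
import struct

# Toy envelope: 4-byte source, 4-byte destination, 2-byte payload length,
# all in network byte order. NOT the real IP header layout.
HEADER = struct.Struct("!4s4sH")


def wrap(src: bytes, dst: bytes, payload: bytes) -> bytes:
    """Put an opaque payload inside a minimal envelope."""
    return HEADER.pack(src, dst, len(payload)) + payload


def unwrap(packet: bytes):
    """Split a packet back into (src, dst, payload)."""
    src, dst, length = HEADER.unpack_from(packet)
    payload = packet[HEADER.size:HEADER.size + length]
    return src, dst, payload


def next_hop(packet: bytes, routes: dict) -> str:
    """Forwarding decision based on the envelope alone."""
    _, dst, _ = HEADER.unpack_from(packet)
    return routes.get(dst, "drop")


host_a = bytes([10, 0, 0, 1])
host_b = bytes([10, 0, 0, 2])
routes = {host_b: "link-1"}  # hypothetical routing table

# Two payloads with unrelated internal structure travel the same way.
text_packet = wrap(host_a, host_b, b'{"greeting": "hello"}')
binary_packet = wrap(host_a, host_b, bytes([0, 255, 7, 42]))
hops = [next_hop(p, routes) for p in (text_packet, binary_packet)]
```

Both packets take the same hop regardless of what the payload contains, which is the "minimal assumptions on both sides" property; what the payload means is left entirely to whoever unwraps it.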