Comment by dataflow

2 months ago

Let me put this differently.

If you tell me that you have a closet for your jackets and another closet for your shirts, you're telling me how clothes are laid out in your wardrobe. Specifically, you're telling me that you're laying those out separately, and able to deal with them independently, with little interference between the two. It's not the entirety of the layout information, but it sure is some of it.

If you tell me that you have a column for your first names and another column for your last names, you're telling me how names are laid out in your database('s files). Specifically, you're telling me that you're laying those out separately, and able to deal with them independently, with little interference between the two. It's not the entirety of the layout information, but it sure is some of it.

Sure -- in theory, you could actually be throwing everything together into a dumpster, then paying enough people to search it all in parallel when you want to retrieve that red jacket. If you're actually doing that, maybe you could legitimately claim that you haven't divulged anything about your closet's layout by telling me that shirts and jackets are separate. But chances are pretty darn good you're not actually doing that (and I would know this for a fact if I already somehow knew you were using closets built by Joe down the street), and thus you are in fact exposing layout information by telling me that you're storing them separately. One security implication is that the moment I get a glimpse of one of your closets and notice that it contains a shirt, I know it's not the one with the jackets, and I can skip it when trying to steal that expensive red jacket.

It's either a file layout or it is not a file layout. If you write an affidavit saying it's "sort of like a file layout", the conclusion will be that it is not one. Now, the Illinois Supreme Court found that it was a file layout (wrongly). But they didn't use any of this kind of message board logic to do it; they pulled up a definition for "file layout" from a technical dictionary (which, ironically, pretty clearly established, even more than this thread does, that schemas aren't file layouts), and then they pulled up a definition of "schema" from Merriam-Webster, and the definition of "schema" was so abstract it could have matched anything.

If anybody on the Illinois Supreme Court had known what a schema actually was, we'd have won the case. Further, if the definition of "file layout" had been more material to the Chancery case, it would have been in the trial record that it wasn't one.

  • > Now, the Illinois Supreme Court found that it was a file layout (wrongly). But they didn't use any of this kind of message board logic to do it; they pulled up a definition for "file layout" from a technical dictionary (which, ironically, pretty clearly established, even more than this thread does, that schemas aren't file layouts)

    "Wrongly" was exactly what I just spent an hour writing a long comment disputing, with a detailed explanation. Specifically, with a real-world analogy between “a description of the arrangement of the data in a file” and “a description of the arrangement of the clothes in your closet.”

    • If I understand correctly, you're saying that you expect items in a column to tend to cluster near one another on disk. Notably, though, that doesn't give you any sort of relative or absolute offset. Neither does it have anything to say about, for example, blocks of different types which might be interleaved. Or compression. Or indexes. Or copy-on-write-related garbage collection. Or journaling. Or any number of other things.

      Now if you wanted to argue that a schema serves the same purpose as a file layout, i.e., that it's how a programmer interfaces with the data, and that it impacts workload performance, that would be fair enough. And given that laws are all about intent, perhaps that would be relevant. (Or perhaps not. I didn't read about the case yet.)

      But I think it's fairly reasonable to say that in typical usage an SQL schema is decidedly not a file layout in a literal sense.
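
      For what it's worth, here is a minimal sketch of that point, assuming SQLite purely as a stand-in (I don't know what database was at issue in the case). The `people` table, the sample rows, and the `build` helper are all made up for illustration. It writes the same schema and the same rows into two files that differ only in page size, then compares the stored schema text and the raw bytes.

```python
import os
import sqlite3
import tempfile

# Hypothetical schema, borrowed from the first-name/last-name example upthread.
SCHEMA = "CREATE TABLE people (first_name TEXT, last_name TEXT)"
ROWS = [("Ada", "Lovelace"), ("Alan", "Turing")]


def build(path, page_size):
    """Create a database at `path` and return (stored schema text, raw file bytes)."""
    conn = sqlite3.connect(path)
    # page_size only takes effect if set before the database file is first written.
    conn.execute(f"PRAGMA page_size = {page_size}")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO people VALUES (?, ?)", ROWS)
    conn.commit()
    stored = conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'people'"
    ).fetchone()[0]
    conn.close()
    with open(path, "rb") as f:
        return stored, f.read()


with tempfile.TemporaryDirectory() as tmp:
    schema_a, bytes_a = build(os.path.join(tmp, "a.db"), 1024)
    schema_b, bytes_b = build(os.path.join(tmp, "b.db"), 8192)

print("same schema:", schema_a == schema_b)
print("file sizes: ", len(bytes_a), "vs", len(bytes_b))
print("same bytes: ", bytes_a == bytes_b)
```

      The schema text comes back identical from both files while the bytes on disk do not, which is the sense in which the schema alone doesn't pin down offsets. It says nothing either way about the legal question.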
