← Back to context

Comment by pavon

2 months ago

Great read. Frustrating that the court ruled that a schema was a file layout, since I don't think it is, but at the same time if it didn't fall under that exception, there is a strong arguments that would be considered "documentation pertaining to all logical ... design of computerized systems". A schema is literally, the logical design of the database, and the database is a part of the computerized system. Once it was ruled that those examples are "per se" exempt it was a long shot to argue that schema wasn't covered by any of the examples.

I completely agree with you that (unlike/despite the Supreme Court ruling), database table/column schema design (and other system designs) should fall under the Illinois statute as "documentation pertaining to all logical and physical design of computerized systems". It's interesting that the law did pick up on that distinction between logical and physical design but none of the parties described in this article did. Logical/physical designs are not just about servers and integrations, they are also about data.

I'm not sure why that wasn't argued by the state and the state argued the database schema was a "file format". Per my reasoning, the state still would have won, but for different reasons.

I disagree with you slightly however and would say that the schema table/column names should be considered not logical but "physical design" while the business naming/meaning of tables would be a "logical design" (or conceptual design). See Wikipedia: https://en.wikipedia.org/wiki/Logical_schema

SQL injection is really about physical schema designs, not logical ones (I do get that every bit of information including business naming of tables/columns helps in an attack, but it does change the degree of threat and thus the balancing tests of the risk which are relevant per the definitions and case law described in the original article.)

So in terms of what the law /SHOULD/ be, the law should not include logical design as a security exception, only physical design. It /SHOULD/ be possible for citizens to do FOIA requests and get a logical understanding of all the database fields without giving them the SQL names that can accelerate SQL injection attacks. In that way citizens could ask for the data by a logical/business-named handle rather than a physical one.

And the state should create logical models or provide data dictionaries with business (not technical terms) on request as part of their FOIAable obligations to their citizens for the data they are maintaining.

My 2 cents as someone designing database schemas for 25+ years.

Schema is definitely software, a operating protocol, source code, and file layout. Maybe also documentation.

  • A schema isn't software in the sense imagined by the ILGA. If it was, every Excel spreadsheet would be too, and Excel spreadsheets are the basic currency of FOIA.

    An "operating protocol" is a step-by-step list of things to accomplish some action. It's a finite state machine for humans. Obviously, a schema isn't that; a schema is declarative, and an operating protocol is imperative.

    The court definitively established that SQL schemas aren't source code in the sense imagined by the ILGA. SQL queries can be. Schemas are not.

    See downthread for why a schema isn't a file format. In fact, a schema is almost the opposite of a file format.

    A court will look at the term "documentation" in the ordinary sense of the word; as in, "a prose description and set of instructions".

    "Associated with automated data processing operations" isn't an element in the statute; it's a description of all of the elements.

    • If the Excel spreadsheet has formulas in it, it's software. If you're just talking about the data in the sheet, i.e. what you'd get exporting it as a CSV, then it's not.

      Col types, unique/FK/PK constraints, default values, and computed cols define the steps for handling row inserts/updates/deletes. Even adding a uniqueness constraint to an already-unique col will change how the code interacts with it, specifically how it deals with concurrency/locking. If they said it has to be an imperative programming language, then it's not that.

      If they said the schema isn't source code then ok, but I still think it is.

      6 replies →

  • I think a schema will definitely be part of the source listing, either in the main programming language source code or in a some other file used to define or initialize the database. But I don't think it is software, any more than a protocol is software. Software does something.

    One tricky aspect of this is that even if the schema itself as a higher level concept doesn't fit into any of those definitions, all existing instances of the schema are likely considered either source listings or documentation. So the instances are barred from release per se, and you can't ask the government to create new documents.

    • The schema defines how the DBMS sets up its tables and such, so it does quite a bit imo. And if the schema isn't stored in any doc cause just manually punched in CREATE TABLE once, yeah what you said about creating new docs.

How is a database schema not a file layout?

  • The article describes why. 2 different db engines (or even instances) can use different file layouts for the same schema.

    In many was sql is all about divorcing the schema from the files.

    • But on the other hand, in all database systems the schema is used to determine how the files are laid out. Although I suppose the same thing could be argued for any data that is stored in a file, excepting that a schema is metadata that determines the organisation of data so it's a bit of a special case.

      7 replies →

  • It literally does not describe a file, and does not literally describe the data layout of anything on disk (though with enough knowledge, you may be able to infer facts about probable layouts).

    • > does not literally describe the data layout of anything on disk

      Huh? Depends on the DMBS, but each InnoDB table is a file.

      And the schema determines the file structure.

      4 replies →

  • The schema describes the database layout. The file layout (if you were going to call it that) in a modern RDBMS would describe how the RDBMS implemented a particular database layout as described by the schema.