Comment by dataflow
2 months ago
I don't understand the argument that knowing the column names doesn't help an attacker? Especially in a database that doesn't allow wildcards, doesn't it make things much easier if you know you can do '); SELECT col FROM logins, as opposed to having to guess the column name?
And I don't think I disagree with the court on schema vs. file layouts either. It's not the file layout, but it's analogous: it tells you how the "files" (records) are laid out on the "file system" (database tables). For example, denormalization is very analogous to inlining of data in a file record. The notion that filesystems are effectively databases itself is a well known one too. How do you argue they aren't analogous?
You can always `SELECT table_name, column_name, data_type FROM information_schema.columns`, which is part of the SQL standard. https://www.postgresql.org/docs/current/infoschema-columns.h...
Plus, generally if you have SQL injection, you have multiple tries. You're not going to be locked out after one shot. And there's only so many combinations of `SELECT {id,userid,user_id,uid} FROM {user,users,login,logins,customer,customer}` before you find something useful.
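To put a rough number on "only so many combinations", here is a minimal sketch using Python's built-in `sqlite3`. The in-memory database and its `logins` table are made-up stand-ins, and the candidate lists are just the guesses above (with `customers` added as an assumption):

```python
import itertools
import sqlite3

# Hypothetical target: an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, passwd TEXT)")

tables = ["user", "users", "login", "logins", "customer", "customers"]
columns = ["id", "userid", "user_id", "uid"]


def guess_schema(conn, tables, columns):
    """Return the (table, column) pairs for which a probe query succeeds."""
    hits = []
    for table, column in itertools.product(tables, columns):
        try:
            # Identifiers come from our own fixed candidate lists above.
            conn.execute(f"SELECT {column} FROM {table} LIMIT 1")
        except sqlite3.OperationalError:
            continue  # "no such table" or "no such column"
        hits.append((table, column))
    return hits


hits = guess_schema(conn, tables, columns)
print(len(tables) * len(columns), "probes; hits:", hits)
```

That is 6 x 4 = 24 probes for this list. Whether an attacker actually gets that many tries depends on the application.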
> You can always `SELECT table_name, column_name, data_type FROM information_schema.columns`, which is part of the SQL standard. https://www.postgresql.org/docs/current/infoschema-columns.h.
You can "always" do that? Well I just did that. My database said: no such table: information_schema.columns
And what if my database had disabled this capability entirely?
Also, is there anything implying SQL here at all? Can't other databases with injection "capability" have schemas?
> Plus, generally if you have SQL injection, you have multiple tries. You're not going to be locked out after one shot.
No, you can't say it with such certainty at all. It really depends on what else you're triggering in the process of that SQL injection. You could easily be triggering something (like a password reset, a payment transaction...) where you're severely limited in your attempts.
> And there's only so many combinations of `SELECT {id,userid,user_id,uid} FROM {user,users,login,logins,customer,customer}` before you find something useful.
account, accounts, password, passwords, profile, profiles, credential, credentials, auth, auths, authentication, authentications, authentication_info, authentication_infos, authorization, authorizations, passwd, passwds, user_info, user_infos, login_info, login_infos, account_info, account_infos... should I keep going?
And these are just the logins/passwords; what if the information of interest was something else, like parking tickets?
Your reasoning and motivation is reductio ad absurdum. It does not make sense to base your system security on hiding from the public that your 'Users' table is called 'Users'. If you are vulnerable to this attack, the guilt rests on your deplorable application code, not whether or not your schema table names are known. If we should follow your logic, we would have to name our Users table U_ZER_CLEVER_S because naming it something people could guess would be a vulnerability.
> You can "always" do that? Well I just did that. My database said: no such table: information_schema.columns
Don't expect attackers to give up after one try. It depends on the database software: not everyone implements this exact ANSI standard for reflection, but every database supports reflection. That's why the first step after finding a SQLi is to fingerprint the database software and go from there.
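A minimal sketch of that point with Python's built-in `sqlite3` (the `logins` table is a made-up stand-in): SQLite has no `information_schema`, but its own catalog answers the same questions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, passwd TEXT)")

# The standard view is absent in SQLite, so this query fails...
try:
    conn.execute("SELECT table_name, column_name FROM information_schema.columns")
    standard_view_error = None
except sqlite3.OperationalError as exc:
    standard_view_error = str(exc)

# ...but the engine's own catalog exposes the same facts.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(logins)")]

print(standard_view_error)
print(table_names, column_names)
```

Other engines have their own catalogs. This only shows the SQLite case.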
> And what if my database had disabled this capability entirely?
You can't disable it; lots of software, database features, ORMs and clients rely on reflection. If a client can query a table, they can also retrieve metadata about that table.
There is one further problem with this entire sub-discussion: There are two mitigation strategies discussed:
- A: guaranteed SQL-injection-proof code (SQL injection impossible).
- B: non-obvious table names and 'secure defaults' (e.g. INFORMATION_SCHEMA disabled).
So the original commenter says he wants to _hide the schema_, so that B can protect him in case A fails. Well, failure of A is Amateur Hour. If you fail on A, I highly doubt you would have delivered correctly on B. To write it out in plain text: if you have set up and manage an application with SQL injection errors, I have a hard time seeing you still taking care to disable or enable obscure security defaults, or to avoid obvious and trivial table names.
Just to put icing on the cake: as soon as you have an SQL injection attack, a simple select * from randomTable or DESC randomTable would give you the table's columns, so it makes utterly no sense to want to hide those column names. You have already lost them (in exactly the case where you are arguing you need their protection), unless you argue that the guy making SQL-injectable applications ALSO has set up a secure default to disallow select *.
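A small sketch of the `select *` point, using Python's built-in `sqlite3` and a made-up `parking_tickets` table. It assumes the result of the injected statement is visible to the attacker, which is not always the case:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parking_tickets (plate TEXT, fine_cents INTEGER, issued_on TEXT)")

# The result metadata carries the column names, even when the table is empty.
cur = conn.execute("SELECT * FROM parking_tickets")
leaked_columns = [col[0] for col in cur.description]
print(leaked_columns)
```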
In my experience, SQL injection is evidence of work of the sloppiest and most immature kind; it was bad in 2003, and presumably still is.
That's a good point, has anyone hardened a database by locking out users who select columns that don't exist? Or run other dubious queries? This would obviously interrupt production but if someone is running queries on your db it's probably worth it?
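A rough sketch of what such a lockout could look like, using Python's built-in `sqlite3`. `GuardedConnection`, `LockedOut`, the threshold, and the client id are all hypothetical names made up for illustration, not an existing product or API:

```python
import sqlite3
from collections import Counter


class LockedOut(Exception):
    """Raised once a client has exceeded its schema-error budget."""


class GuardedConnection:
    """Hypothetical wrapper: block a client after repeated schema errors."""

    def __init__(self, conn, max_schema_errors=3):
        self.conn = conn
        self.max_schema_errors = max_schema_errors
        self.errors = Counter()

    def execute(self, client_id, sql, params=()):
        if self.errors[client_id] >= self.max_schema_errors:
            raise LockedOut(client_id)
        try:
            return self.conn.execute(sql, params).fetchall()
        except sqlite3.OperationalError as exc:
            # SQLite reports "no such table: ..." / "no such column: ...".
            if "no such" in str(exc):
                self.errors[client_id] += 1
            raise


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER)")
guarded = GuardedConnection(conn, max_schema_errors=2)

# Two wrong column guesses from the same client use up its budget.
for guess in ("uid", "userid"):
    try:
        guarded.execute("10.0.0.7", f"SELECT {guess} FROM logins")
    except sqlite3.OperationalError:
        pass
```

After those two misses, even a valid query from `10.0.0.7` raises `LockedOut`, while other clients are unaffected. The cost is that a buggy deploy can lock out production the same way.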
I once did a security assessment for a product such as what you describe. Among other problems with it, the product itself had SQL injection vulnerabilities.
For another example of what defenders are up against, see https://users.ece.cmu.edu/~adrian/731-sp04/readings/Ptacek-N.... This paper all but caused an upheaval in the WAF industry.
If you are mature enough to do that, you're mature enough to prevent SQL injections in the first place. There shouldn't be that many handwritten queries to review anyway, as most mundane DB access usually goes through a framework that handles injection properly...
Zane Lackey (with Dan Kaminsky) gave a talk that discussed doing literally that sort of thing, back in 2013. Zane went on to found Signal Sciences (acquired by Fastly), doing this sort of stuff in the 'WAF' space.
https://youtu.be/jQblKuMuS0Y?t=866 (timestamp is when Zane starts talking about it)
In the very early 2000’s I worked at a company building something along those lines. We could analyze SQL and SMB traffic on the fly and spot anomalous access to tables/columns/files, etc. Dynamic firewalling would have been the next progression if the company didn’t have other issues.
WAFs help with this, but at the HTTP level, by putting “information_schema” and “sys.tables” in the filters.
Not the real solution, IMO, but WAFs are useful for more than SQLi, and are the kind of tech you can ask money for.
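A toy sketch of that kind of filter in Python. The function name `waf_allows` and the pattern list are assumptions for illustration; real WAF rule sets are far larger:

```python
import re
from urllib.parse import unquote_plus

# Hypothetical blocklist: the two catalog names mentioned above.
BLOCKED = re.compile(r"information_schema|sys\.tables", re.IGNORECASE)


def waf_allows(raw_query_string: str) -> bool:
    """Return False if the URL-decoded query string mentions a blocked name."""
    decoded = unquote_plus(raw_query_string)
    return BLOCKED.search(decoded) is None


print(waf_allows("id=42"))
print(waf_allows("id=1+UNION+SELECT+table_name+FROM+Information_Schema.tables"))
print(waf_allows("id=1+UNION+SELECT+name+FROM+sqlite_master"))
```

The last request passes because the filter only knows the literal strings it was given, which is one reason this is not the real solution.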
On the surface that’s a very attractive idea.
A sort of “you shouldn’t be in here, even if we left the door unlocked.”
A good DBA would restrict the account so that it can't access the information schema. It's easy to imagine an environment with a vigilant DBA and less vigilant web developers.
This makes sense, but the vast majority of tooling, including ORMs, autocomplete SQL IDEs, and even suspect application code, relies on table descriptions and listings provided by the information schema.
Ah so what you're saying is that we ought to rename our logins table to "duckwords" because nobody will ever guess that? Also we should probably store passwords in plaintext but name the column "entercod3" because nobody will think of that. Oh and we should use printf with %s to build our queries right?
Being able to inject doesn't mean you get the output of a SELECT. The injection can be in non-SELECT statements.
The Department of Justice disagrees and voluntarily releases column and table names: https://www.justice.gov/afp/media/1186431/dl?inline=
> I don't understand the argument that knowing the column names doesn't help an attacker?
So Kevin Mitnick supposedly did most of his hacking using "social engineering". He'd call up some person, pretend to be in some other department within their organization, and ask them for some specific bit of information he needed to further his attack (or ask them to change some specific thing that would allow him to further his attack).
Would knowing the structure of Illinois governmental organizations help someone perform social engineering attacks against them? Yes, absolutely.
Should Illinois therefore keep the internal structures of their organizations -- the department names and the officials who run them -- secret? No, absolutely not.
First of all, if an attacker doesn't know them, they'll just use other social engineering attacks to figure them out; i.e., hiding the structure doesn't stop social engineering attacks, it just slows them down. Secondly, the value to the public of being able to navigate governmental structures far outweighs the cost of potential attacks.
This seems to me to be a direct analog: The "organizational structure" is the "database schema", and the "willingness to help a random person on the phone who seems to know what they're talking about" is the "SQL injection vulnerability". If an attacker knows the schema, their job is faster; but if they don't know the schema, they'll just use attacks to figure out the schema; so keeping it private doesn't stop an attack, it only slows it down. And the benefit to the public of being able to issue FOIA requests far outweighs the cost of potential attacks.
> And I don't think I disagree with the court on schema vs. file layouts either.
I disagree that the law should prohibit disclosing "file layouts" but it's pretty clear that the law does block that, and I fundamentally agree with you that schemas are directly analogous to file layouts and thus restricted.
A SQL schema literally does not indicate the locations of data inside of a file. In fact, the whole reason schemas exist is to decouple the relationships between table rows and the pages and indexes that store that data. We had relational databases before SQL, and there are non-SQL relational (and non-relational) databases today, but you program them, at the query level, with code that is aware of what tables live where.
A schema is the opposite of a file layout. A schema is to a file layout what a Google search is to an IP address.
Let me put this differently.
If you tell me that you have a closet for your jackets and another closet for your shirts, you're telling me how clothes are laid out in your wardrobe. Specifically, you're telling me that you're laying those out separately, and able to deal with them independently, with little interference between the two. It's not the entirety of the layout information, but it sure is some of it.
If you tell me that you have a column for your first names and another column for your last names, you're telling me how names are laid out in your database('s files). Specifically, you're telling me that you're laying those out separately, and able to deal with them independently, with little interference between the two. It's not the entirety of the layout information, but it sure is some of it.
Sure -- in theory, you could be actually throwing everything together into a dumpster, then paying enough people to search it all in parallel when you want to retrieve that red jacket. If you're actually doing that, maybe you could legitimately claim that you haven't divulged anything about your closet's layout by telling me that shirts and jackets are separate. But chances are pretty darn good you're not actually doing that (and I would know this for a fact if I already somehow knew you were actually using closets built by Joe down the street), and thus actually are exposing layout information by telling me that you're storing them separately. One security implication of which is that, the moment that I get a glimpse of your closet and notice that it contains a shirt, I know it's not the one with the jackets, and I can skip it when trying to steal that expensive red jacket.
I don't think "file layout" has to mean the exact location of every byte. An abstract file layout is still a file layout.
> A SQL schema literally does not indicate the locations of data inside of a file.
That's only true if you apply eg the Unix definition of what a file on a file system is (like a sequence of bytes or whatever).
For all we know, the law might take a broader view. Something like: a 'file' is anything that in the olden days you would have stuck into a filing cabinet.
The 'Unix' definition isn't even particularly natural: it's one specific level of abstraction. On disk, the bytes aren't necessarily laid out one after another. Especially with fragmentation, compression and encryption going on.
An SQL schema tells you how data is laid out in a different layer of abstraction than the Unix view of bytes. But that view isn't the only one that the law can mean by 'file'.
>> And I don't think I disagree with the court on schema vs. file layouts either.
> I disagree that the law should prohibit disclosing "file layouts"
Note, the court wasn't ruling what the law should say, only what the law says. At least that's my understanding of it. I certainly wasn't opining on what the law should say.
Understood. I mention that distinction only because I find many people (not you) who say that "X law doesn't apply because if it did, it would be bad" instead of directing their ire at the actual laws, which are poorly written, and at the legislators who are negligent in fixing those laws.
Courts should decide based on the law, not based on what is "good".
It seems like an unnecessarily ambiguous term.
Without additional context, I would interpret the term “file layout” to mean the file and directory structure of an application.
Such an application could potentially store data as plain files, and the names of those files may contain personal or sensitive information.
> Without additional context, I would interpret the term “file layout” to mean the file and directory structure of an application.
I would interpret it to mean a description of what the file contains and where. This is information you need if you have a mysterious file and you want to parse it. It's also information you need if you have some data and you want to create a readable file that expresses it. But for the concept to apply to a database schema, (a) the database would have to be a file, and (b) the schema would have to specify where the information in the database is stored. That's difficult to do, since the schema has no knowledge of how much information there is in the database or how it might be written down.
> It seems like an unnecessarily ambiguous term.
Agreed, and I don't even understand why it's in there in the first place (it just shouldn't be), but that's a job for the legislature to resolve, not the courts.
And this part seems self-defeating:
> Attackers like me use SQL injection attacks to recover SQL schemas. The schema is the product of an attack, not one of its predicates.
If it's the product of an attack, but not the end goal, surely it's of value to the attacker?
It seems clear to me that the statute does, as worded, in principle allow the city not to disclose the database schema - it would compromise the security of the system, or at the very least, it would for some systems, so each request needs to be litigated individually.
The proposed amendment sounds like a good way to fix this - is it likely that will pass?
Lots of things are "of value". That's not the bar the statute sets. To the extent something isn't per se exempted by the statute (as the outcome of the case established schemas are), the burden is on the public body to demonstrate that disclosure would jeopardize the security of the system.
It still seems like a massively gray area: despite the distinction between "would jeopardize" and "could jeopardize" as explained by TFA, the definition of "jeopardize" includes "danger" which means "could lead to harm" not "would lead to harm" at which point it hardly matters whether a thing "could endanger" or "would endanger" the security of the system.
> If it's the product of an attack, but not the end goal, surely it's of value to the attacker?
Well sure, but it doesn't help them attack. That's like arguing that since the bank robber wants dollar bills, dollar bills must be a useful tool for breaking into bank vaults.
If both sides agreed to the analogy of giving the bank robber the blueprints to the vault, I think any lay judge would agree that endangers the bank's security.
If you have an injection friendly application then that is the security problem.
Say someone hacks the db: is the problem the easy-to-guess table names? Should the column never have been called "passwords"?
Perhaps 30 years ago that would sound good.
Obscurity should hardly ever be a line of defense. If it is the only defense the problem isn't that it wasn't obscure enough.
Edit:
I'll do you one better. If you so much as suggest that obscurity is good security, you openly invite people to fool around with your applications. The odds that holes will be found are much better than elsewhere.
What do you do when you know you've got a pile of poorly written insecure software and no money to improve it?
I probably delete everything and pretend it never happened. It depends, of course, on the worst-case scenario. What can I do or afford to deal with the greatest risk? I might use it on a machine without internet.
I agree with you. Knowing the exact column names can speed up an attack and, in some cases, make it more feasible.
Why don’t they just request disclosure of what’s actually stored and allow renaming of the columns? It seems odd that knowing the exact column names would be necessary if the goal is simply to understand what data is being stored and its intended purpose.
I wonder if that would be considered a "new report", which they don't have to provide.
They can either have their cake or eat it. If they don't want to obfuscate the column names, they have to provide the data with the original ones.
> Knowing the exact column names can speed up an attack and, in some cases, make it more feasible.
If I'm looking at a database, I like knowing column names, but I like knowing table names more.
>It's not the file layout, but it's analogous...How do you argue they aren't analogous?
laws don't get to be analogous
foia request: "I'd like the report the committee prepared about the costs for the new bridge"
response: "denied. the report contains costs laid out in tables with headings, which while not being schemas are analogous, with schemas not being files but being analogous"
Yeah, I think it's still useful info for an attacker. But only if the system was actually developed by amateurs who never heard of parameterized queries.
I find it a bit bizarre that the city uses "our system was developed with no consideration for security" as a valid defense.
'); SELECT * FROM logins --
Look everyone, it's Little Bobby Tables.
`Especially in a database that doesn't allow wildcards`
Such as...
This fails if either the UI sanitizes wildcards, or if the database prohibits them, or if it produces so much data that you can't ingest it in time, etc.
It also fails if the system was written using parameterized queries. I wouldn't expect a system to be sanitizing anything if it fails to take the most basic step for db access. This whole discussion is only relevant for systems developed by amateurs. SQL injection can only work at all if you use string concatenation to create queries, which you should never do.
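A minimal sketch of the difference, using Python's built-in `sqlite3`. The `logins` table, its rows, and the payload are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (user_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO logins VALUES (?, ?)", [(1, "alice"), (2, "bob")])

payload = "nobody' OR '1'='1"

# Vulnerable: the input is spliced into the SQL text and changes its structure.
concatenated = conn.execute(
    "SELECT user_id FROM logins WHERE name = '" + payload + "'").fetchall()

# Parameterized: the input is bound as a value and never parsed as SQL.
parameterized = conn.execute(
    "SELECT user_id FROM logins WHERE name = ?", (payload,)).fetchall()

print(len(concatenated), len(parameterized))
```

The concatenated query matches every row, while the parameterized one looks for a user literally named `nobody' OR '1'='1` and finds none.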
Injections don't always need ''. The statements
and
if injected into a query will give different answers if SQLI exists.
There are MANY other tricks that don't involve ''.
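One hypothetical illustration of a quote-free probe (my own example with a made-up `tickets` table, not necessarily the statements meant above): a numeric parameter concatenated into a query, tested with a condition that is always true and one that is always false.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, plate TEXT)")
conn.execute("INSERT INTO tickets VALUES (1, 'ABC123')")


def lookup(conn, ticket_id: str):
    # Deliberately vulnerable: numeric input concatenated into the SQL text.
    return conn.execute(
        "SELECT plate FROM tickets WHERE id = " + ticket_id).fetchall()


# No quote characters anywhere in either payload.
true_case = lookup(conn, "1 AND 1=1")
false_case = lookup(conn, "1 AND 1=2")
print(true_case, false_case)
```

If the two responses differ, the input is being interpreted as SQL, so stripping quote characters alone would not have helped here.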
Besides, consider the number of valid queries done by the application that involve '*'. You are not going to turn that off.
Sanitization almost always fails. This becomes an arms race.
There are trivial ways around all of those. `LIMIT 1`, `SELECT .. FROM information_schema...`, etc.