Comment by _alternator_

4 days ago

The “choosing the program” part is vague, but in many practical cases the user gets to choose the program by choosing input data.

This goes back to the rather fuzzy distinction between “data” and “program” you may remember from your early CS days. More precisely, from a theoretical CS perspective, there is no solid difference between data and program.

Almost all practical ZK schemes require the user to choose some input (e.g., the root of the Merkle tree representing the “current state” of the blockchain, plus secrets like keys and the amount of a transaction).

From some perspective, you get a different program for each different input; sometimes people call this “currying” or “partial evaluation”.
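
To make the "different program for each input" view concrete, here is a minimal sketch of partial evaluation in Python. The function `transfer_is_valid` and its parameters are hypothetical stand-ins, not taken from any real ZK scheme; the point is only that fixing one argument yields a new, specialized callable.

```python
from functools import partial

def transfer_is_valid(state_root: bytes, amount: int, balance: int) -> bool:
    """Hypothetical generic 'program': checks a transfer against some state."""
    # Assumption for illustration: a state root is a 32-byte digest.
    return len(state_root) == 32 and 0 <= amount <= balance

# Choosing the input fixes part of the program: this is partial evaluation.
chosen_root = bytes(32)  # placeholder root, all zero bytes
specialized = partial(transfer_is_valid, chosen_root)

# 'specialized' is now a different program that only takes the remaining inputs.
print(specialized(5, 10))
print(specialized(11, 10))
```

Each distinct `chosen_root` gives a distinct `specialized` function, which is the sense in which picking data can amount to picking the program.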

So yeah, it’s more serious than it seems at first blush.

4 comments


Groxx 4 days ago

It seems to be pretty explicit that the "program" being run contains the full hashing algorithm used (to output the correct answers most of the time) plus additional logic to allow cheating.

That rather clearly goes wildly beyond what most ZK schemes use. That's arbitrary code execution of your choice, either as input or as part of selecting the program. Which seems like it puts this somewhere near "if you allow `eval` on user input in your script, it could do anything", doesn't it?

Plus like. They fixed it. That seems to imply it's more of an implementation flaw than a fundamental, even if it may be a surprisingly achievable one.

_alternator_ 4 days ago

So the proofs I’m most familiar with embed programs as polynomials over finite fields. Input data also corresponds to choosing some coefficients, and if you can choose enough coefficients (enough to embed the hash function) then the attack may be feasible.
The problem is compounded because the hash functions are typically chosen to have extremely short polynomial representations.
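
As a rough illustration of what a "short polynomial representation" can look like, here is a toy hash whose round function is a single cubing over a prime field, loosely in the style of algebraic hashes such as MiMC. The modulus, round count, and round constants are assumptions picked for readability; this is not the hash from the attack or from any deployed scheme, and it is not secure.

```python
P = 2**61 - 1  # a prime modulus, chosen only for illustration

def toy_round(x: int, key: int, const: int) -> int:
    """One round: a degree-3 polynomial in x over the field of integers mod P."""
    return pow((x + key + const) % P, 3, P)

def toy_hash(x: int, key: int = 0, rounds: int = 8) -> int:
    """Iterate the cubic round; the whole map has degree 3**rounds in x."""
    for i in range(rounds):
        x = toy_round(x, key, i + 1)  # round constants 1, 2, 3, ... (assumed)
    return (x + key) % P

print(toy_hash(0, rounds=2))  # (0 + 1)**3 = 1, then (1 + 2)**3 = 27
```

Each round costs only a couple of field multiplications, so a circuit embedding the full hash needs relatively few coefficients.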

GTP 4 days ago

So, in the end, what is the core concept of the attack? Were they able to generate a program (maybe exploiting the fuzziness of the distinction between data and program to generate a program) that contains the hash of itself? I doubt that this is it, as if this were the case, then it would be likely to be an issue with a specific hash function and not a general issue. Unless they are using some trick like the one I presented above, but then it seems to me that the problem wouldn't be with Fiat-Shamir itself.
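
For reference, the Fiat-Shamir transform itself replaces the verifier's random challenge with a hash of the transcript. Below is a minimal sketch using a Schnorr proof of knowledge of a discrete log. The group parameters are toy-sized and insecure, and the `challenge` encoding is an assumption made for illustration; it shows only the baseline transform, not the attack discussed in this thread.

```python
import hashlib
import secrets

P, Q, G = 23, 11, 2  # toy group: G has order Q modulo P (insecure sizes)

def challenge(*values: int) -> int:
    """Fiat-Shamir step: derive the challenge by hashing the transcript."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q

def prove(x: int):
    """Prove knowledge of x such that y = G**x mod P, without interaction."""
    y = pow(G, x, P)
    r = secrets.randbelow(Q)   # prover's one-time nonce
    t = pow(G, r, P)           # commitment
    c = challenge(G, y, t)     # hash stands in for the verifier's coin flips
    s = (r + c * x) % Q        # response
    return y, (t, s)

def verify(y: int, proof) -> bool:
    t, s = proof
    c = challenge(G, y, t)     # verifier recomputes the same challenge
    return pow(G, s, P) == (t * pow(y, c, P)) % P

y, proof = prove(7)
print(verify(y, proof))
```

The security argument treats the hash as a random oracle; the thread's question is what happens when the statement being proven can itself depend on that concrete hash function.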

trod1234 4 days ago

My read of this is:
There were parts of the process that check whether it's a valid proof and that were previously thought to be equivalent, but in fact were not. The computer scientists involved knew of cases where they were not equivalent but moved ahead anyway, on the assumption that no one would design systems in the ways the attacks require.
Attacks only get better.
The original process included the original hash in such systems' inputs, but also allowed additional malicious features to be included that rearrange the output so that it passes the proof scheme's checks despite being incorrect.
By placing a constraint on input entropy they believe they've mitigated the issue, but it also breaks many applications, with no good alternative.
IMO, this is a weak assertion considering finite fields are used. The title is really misleading.
It should be "Fiat-Shamir is broken: practical attacks".
These are finite fields, so the token output generated doesn't necessarily correspond to the specific path taken.
There may be infinitely many paths, and computation has the classical computer science problem of automatically deriving decidable paths from the given tokens, potentially leading to the same issues of discovering simple breaks.