Comment by maxdamantus

6 days ago

Interesting .. from the post above:

> The projects examined contained a total of 120,964,221 lines of Python code, and among them the script found 203 instances of control flow instructions in a finally block. Most were return, a handful were break, and none were continue.

I don't really write a lot of Python, but I do write a lot of Java, and `continue` is the main control flow statement that makes sense to me within a finally block.

I think it makes sense when implementing a generic transaction loop, something along the lines of:

  <T> T executeTransaction(Function<Transaction, T> fn) {
    for (int tries = 0;; tries++) {
      var tx = newTransaction();
      try {
        return fn.apply(tx);
      } finally {
        if (!tx.commit()) {
          // TODO: potentially log number of tries, maybe include a backoff, maybe fail after a certain number
          continue;
        }
      }
    }
  }

In these cases "swallowing" the exception is often intentional, since the exception could be due to some logic failing as a result of inconsistent reads, so the transaction should be retried.

The alternative ways of writing this seem more awkward to me. Either you need to store the result (returned value or thrown exception) in one or two variables, or you need to duplicate the condition and the `continue;` behaviour. Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.

If there are particular exceptions that should not be retried, these would need to be caught/rethrown and a boolean set to disable the condition in the `finally` block, though to me this still seems easier to reason about than the alternatives.

13 comments

maxdamantus

AdieuToLogic 6 days ago

> Having the retry logic within the `finally` block seems like the best way of denoting the intention to me, since the intention is to swallow the result, whether that was a return or a throw.

Except that is not the documented intent of the `finally` construct:

  The finally block always executes when the try block exits. 
  This ensures that the finally block is executed even if an 
  unexpected exception occurs. But finally is useful for more 
  than just exception handling — it allows the programmer to 
  avoid having cleanup code accidentally bypassed by a 
  return, continue, or break. Putting cleanup code in a 
  finally block is always a good practice, even when no 
  exceptions are anticipated.[0]

Using `finally` for implementing retry logic can be done, as you have illustrated, but that does not mean it is "the best way of denoting the intention." One could argue this is a construct specific to Java (the language) and does not make sense outside of this particular language-specific idiom.

Conceptually, "retries" are not "cleanup code."

0 - https://docs.oracle.com/javase/tutorial/essential/exceptions...

maxdamantus 6 days ago
Sounds like the right intent to me. To pinpoint your existing quote from the documentation:
> The finally block always executes when the try block exits. This ensures that the finally block is executed even if an unexpected exception occurs.
The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
I'm not sure how else to interpret that to be honest. If you've got a clearer way of expressing this, feel free to explain.
- locknitpicker 6 days ago
  
  > The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
  A transaction failing is the opposite of an unexpected event. Transactions failing is a central use case of any transaction. Therefore it should be handled explicitly instead of using exceptions.
  Exceptions are for unexpected events such as the node running out of memory, or a process failing to write to disk.
  
  1 reply →
- AdieuToLogic 5 days ago
  
  > The intent of the transaction code is that the consistency is checked (using `tx.commit()`) "even if an unexpected exception occurs".
  First, having a `commit` unconditionally attempted when an exception is raised would surprise many developers. Exceptions in transactional logic are often used to represent a "rollback persistent store changes made thus far" scenario.
  Second, using a condition within `finally` to indicate a retry due to a `commit` failing could be expressed in a clearer manner by having it within the `try` block as described by IntelliJ here[0].
  0 - https://www.jetbrains.com/help/inspectopedia/ContinueOrBreak...
  
  2 replies →

brabel 6 days ago

Doesn't that code ignore errors even if it runs out of retries? Don't you want to log every Exception that happens, even if the transaction will be retried?

This code is totally rotten.

maxdamantus 6 days ago

A result of an inconsistent transaction should be discarded whether it's a return value or a thrown exception. If it runs out of tries another error should be thrown. This should only happen due to contention (overlapping transactions), not due to a logical exception within the transaction.
You can add extra logging to show results or exceptions within the transaction if you want (for the exception this would simply be a `catch` just before the `finally` that logs and rethrows).
I've omitted these extra things because it's orthogonal to the point that the simplest way to express this logic is by having the `continue` control flow unconditional on whether the code was successful .. which is what you use `finally` for.
If you did this in Rust noone would complain, since the overall result is expressed as a first-class `Result<T, E>` value that can naturally be discarded. This is why Rust doesn't have `finally`.
Rust is also a lot more permissive about use of control flow, since you can write things like `foo(if x { y } else { continue }, bar)`.
Personally, I prefer when the language gives a bit more flexibility here. Of course you can write things that are difficult to understand, but my stance is still that my example code above is the simplest way to write the intended logic, until someone demonstrates otherwise.
I don't think this is a restriction that generally helps with code quality. If anything I've probably seen more bad code due to a lack of finding the simplest way to express control flow of an algorithm.
I'm sure there's some train of thought that says that continue/break/return from a loop is bad (see proponents of `Array.prototype.forEach` in JS), but I disagree with it.

Someone 6 days ago

> if (!tx.commit())

https://docs.oracle.com/javase/8/docs/api/java/sql/Connectio...:

  void commit()
     throws SQLException

⇒ this code I won’t even compile for the java.sql.Transaction” class that is part of the Java platform.

(I think having commit throw on error is fairly common. Examples: C# (https://learn.microsoft.com/en-us/dotnet/api/system.data.sql...), Python (https://docs.python.org/3/library/sqlite3.html#sqlite3.Conne...))

maxdamantus 6 days ago
I wasn't thinking of JDBC SQL transactions specifically, but sure, different APIs can denote retriable transaction failures differently. Instead of:
if (!tx.commit()) { continue; }
you do:
try { conn.commit(); } catch (SQLTransactionRollbackException e) { continue; }
and the principle still applies. The simplest solution still involves a `continue` within the `finally` block.
Whether it's a good idea to actually do this directly using SQL connections is another question .. SQL databases usually use pessimistic locking, where the transaction failures are actually "deadlocks" that are preferably avoided through careful ordering of operations within the transaction (or more commonly, YOLOing it using an isolation level that allows read anomalies). Without going into all the details, this probably has a large influence over the design of the SQL APIs you're referring to.

immibis 6 days ago

This code is wrong. You don't want to commit a transaction if an exception is thrown during the transaction.

maxdamantus 5 days ago

You want to at least check that the exception was raised in the absence of read anomalies. The check for read anomalies in OCC happens during the commit phase.
Setting a transaction to read-only on error is possible using the code (using a rethrowing catch within the transaction), but this is not universally desirable.
If you're using transactions to run fairly arbitrary code atomically (assuming no static state outside of the transaction), the expected behaviour would be that modifications leading up to an exception (in a non-anomalous transaction) are still persisted. Eg, imagine the code within the transaction is updating a persisted retry counter before performing a fallible operation. In this case you want the counter to be updated so that the transaction doesn't just fail an infinite number of times, since each time you roll back on error you're just restoring the state that leads to the error.
Another case would be where the exception is due to something that was written within the transaction. If the exception were raised but the writes were not persisted, it would at least be confusing seeing the exception, and possibly logically incorrect depending on how the irrelevant exception is handled (since it's due to data that theoretically doesn't exist).