
Comment by potamic

2 days ago

The point of status codes is to have a standard that any client can understand. If you have a load balancer, the load balancer can detect unhealthy backends based on the status code. Similarly, if you have some job scheduler or workflow engine that's calling your API, it can execute an appropriate retry strategy based on the status code. The client in most cases does not care about why something failed, only whether it has failed. Being able to tell apart if the failure was due to reverse proxy or database or whatever is the server's concern and the server can always do that with its own custom error codes.
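To make the retry point concrete, here is a minimal sketch of how a generic caller (a scheduler, say) might pick a strategy from the status code alone. The `RetryAction` names and the set of statuses treated as transient are assumptions for illustration, not something any standard mandates.

```python
from enum import Enum


class RetryAction(Enum):
    SUCCESS = "success"
    RETRY_WITH_BACKOFF = "retry_with_backoff"
    GIVE_UP = "give_up"


# Statuses this sketch treats as transient. The exact set is an assumption;
# real schedulers usually make it configurable.
TRANSIENT_STATUSES = {408, 429, 500, 502, 503, 504}


def retry_action(status: int) -> RetryAction:
    """Decide what to do next using only the HTTP status code."""
    if 200 <= status < 300:
        return RetryAction.SUCCESS
    if status in TRANSIENT_STATUSES:
        return RetryAction.RETRY_WITH_BACKOFF
    # Everything else (e.g. 400, 404, and redirects this sketch does not
    # follow) is treated as not worth retrying unchanged.
    return RetryAction.GIVE_UP
```

Note the caller never needs to parse the body to make this decision, which is the whole argument.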

> The client in most cases does not care about why something failed, only whether it has failed.

"...and therefore using different status codes in the responses is mostly pointless. Therefore, use 200 and put "s":"error" in the response".

> Being able to tell apart if the failure was due to reverse proxy or database or whatever is the server's concern.

One of the very common failures is for the request to simply never reach "the server". In my experience, one of the very first steps in improving the error handling quality (on the client's side) is to start distinguishing between the low-level errors of "the user has literally no Internet connection" and "the user has connected somewhere, but that thing didn't really speak the server protocol", and the high-level errors of "the client has talked with the application server (using the custom application protocol and everything), and there was an error on the application server's side". Using HTTP status codes for both low- and high-level errors makes such distinctions harder to figure out.
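A rough sketch of the three-way distinction on the client side, assuming the application protocol wraps every reply in a JSON envelope with an `"s"` field as in the example above. The `Outcome` type, the `"ok"` value, and the category names are made up for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    """What the client observed for one request (hypothetical type)."""
    connect_error: Optional[str] = None  # set when no connection was made
    body: Optional[str] = None           # raw response body, if any


def classify(outcome: Outcome) -> str:
    # Low-level: the request never reached anything at all.
    if outcome.connect_error is not None:
        return "no_connection"
    # Low-level: something answered, but not in the application protocol
    # (captive portal, misconfigured proxy, ...).
    try:
        envelope = json.loads(outcome.body or "")
    except json.JSONDecodeError:
        return "not_our_server"
    if not isinstance(envelope, dict) or "s" not in envelope:
        return "not_our_server"
    # High-level: the application server itself replied.
    return "ok" if envelope["s"] == "ok" else "application_error"
```

Here the envelope, not the status code, is what proves the reply came from the application server.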

  • I did say most cases, not all cases. Some concerns are considered cross-cutting enough to have made it into the standard. For instance, many clients will handle a 401 by redirecting to an auth flow, handle a 429 (rate limited) by backing off before making another request, handle a 426 by upgrading the protocol, etc. Not all statuses may be relevant for a given system; you can club several scenarios under a 400 or a 500 and that's perfectly fine for many use cases. But when you have cross-cutting concerns, it's beneficial to follow fine-grained status codes. It gives you a lot of flexibility in how you can connect different parts of your architecture and reduces integration headaches.

    I think a more precise term for what you're describing is transport errors vs business errors. You're right that you don't want to model all your business errors as HTTP status codes. Your business scenarios are most certainly numerous and need to be much more fine-grained than what the standard offers. But the important thing is all errors business or transport eventually need to map to an HTTP status code because that's the protocol you're ultimately speaking.
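As a sketch of what that mapping could look like: many fine-grained business error codes collapse onto a few HTTP statuses, while the body still carries the precise code. All error codes, the body shape, and the choice of 500 as a fallback are hypothetical.

```python
import json
from typing import Tuple

# Hypothetical business error codes mapped to the closest HTTP status.
BUSINESS_ERROR_STATUS = {
    "USER_NOT_FOUND": 404,
    "INVALID_EMAIL": 400,
    "EMAIL_ALREADY_REGISTERED": 409,
    "QUOTA_EXCEEDED": 429,
}


def to_http_response(error_code: str, message: str) -> Tuple[int, str]:
    """Return (status, body): a coarse status plus the fine-grained code."""
    # Unmapped codes fall back to 500 in this sketch; that is an assumption.
    status = BUSINESS_ERROR_STATUS.get(error_code, 500)
    body = json.dumps({"error": {"code": error_code, "message": message}})
    return status, body
```

Generic intermediaries only look at the status; clients that know the API can still branch on `error.code`.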

    • > transport errors vs business errors

      Yes, pretty much.

      > But the important thing is all errors business or transport eventually need to map to an HTTP status code because that's the protocol you're ultimately speaking.

      "But the important thing is, all errors, business or transport, eventually need to map to the set of TCP flags (SYN, ACK, FIN, RST, ...) because that's the protocol you're ultimately speaking". Yeah, they do map, technically speaking: to just an ACK. Because it's a payload, transported agnostically to its higher-level meaning. It's yet another application of the end-to-end principle.

What is an unhealthy request? Is searching for a user which was _not found_ by the server unhealthy? Was the request successful? That's where different opinions exist.

  • Sure, there's some nuance to it that depends on your application, but it's the server's responsibility to make that call, not the client's. The status code exists for this reason, and the standard also classifies status codes under client error (4xx) and server error (5xx) so that clients can determine whether a server is unhealthy simply by looking at the status code.
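For example, a load balancer doing passive health checks could look only at the status class. This is a minimal sketch; the `BackendHealth` class, the window size, and the threshold are assumptions, not how any particular load balancer works.

```python
from collections import deque


def is_server_error(status: int) -> bool:
    """5xx is the server-error class; 4xx (e.g. 404) is the client's problem."""
    return 500 <= status <= 599


class BackendHealth:
    """Tracks the last `window` responses from one backend (hypothetical)."""

    def __init__(self, window: int = 10, max_server_errors: int = 5):
        self.recent = deque(maxlen=window)
        self.max_server_errors = max_server_errors

    def record(self, status: int) -> None:
        self.recent.append(is_server_error(status))

    def healthy(self) -> bool:
        # Unhealthy once the window holds too many server errors.
        return sum(self.recent) < self.max_server_errors
```

Under this scheme a "user not found" 404 never counts against the backend, because the server has already classified it as a client error.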