
Comment by ycombinatrix

1 day ago

Why would someone "sanitize" OutOfMemoryException out of their logs? That is a silly point to make.

The point is not to sanitize known strings like "OutOfMemoryException". The point is to sanitize or (preferably) escape any untrusted data that gets logged, so that it won't be confused for something else.

  • i think GP's point is how would you even sanitize the string "OutOfMemoryException" which presumably comes from a trusted system

    i guess demanding "Structured logs for everything or bust" is the answer? (i'm not a big o11y guy so pardon me if this is obvious)

    • Low-tech example: escape all newlines in user-supplied strings, then add a known prefix to all user-supplied data (let's say a double hashtag ##, but anything else works too). When you want to search logs for strings coming from your system, remove/ignore everything after the marker.

      It all comes down to understanding whether the intersection of two grammars (what your system can write to the log, and what a user can cause to be written) is empty.
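
A minimal sketch of the marker idea above, in Python. The names `MARKER`, `escape_untrusted`, `format_line`, `trusted_part`, and `has_oom` are made up for illustration, and the sketch assumes system-generated messages never contain the marker or a line break.

```python
MARKER = "##"  # hypothetical marker; system messages must never contain it


def escape_untrusted(value: str) -> str:
    """Escape backslashes and line breaks so the value stays on one line."""
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def format_line(system_msg: str, user_data: str) -> str:
    """Build one log line: trusted text, then the marker, then escaped user data."""
    if MARKER in system_msg or "\n" in system_msg or "\r" in system_msg:
        raise ValueError("system message must not contain the marker or line breaks")
    return f"{system_msg} {MARKER} {escape_untrusted(user_data)}"


def trusted_part(line: str) -> str:
    """Return only the text before the first marker, i.e. the system-written part."""
    return line.split(MARKER, 1)[0].rstrip()


def has_oom(log_text: str) -> bool:
    """Look for the exception name only in the trusted part of each line."""
    return any(
        "OutOfMemoryException" in trusted_part(line)
        for line in log_text.split("\n")
    )
```

A search for "OutOfMemoryException", even one with an embedded newline, ends up after the marker on a single line, so `has_oom` ignores it. User data that itself contains ## is harmless, because only the text before the first marker is treated as trusted.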


Absolutely incredible how dense HN can be and that no one has explained this. Obviously that isn’t what they are saying; they are saying it’s profoundly stupid for the server to be controlled by a simple string search at all.

An OutOfMemoryException log should not be the same as a search log

  Error: OutOfMemoryException

And

  Search: OutOfMemoryException

Should not be related in any way
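
One possible way to keep the two kinds of entry unrelated is to write each log line as a structured record with an explicit kind field, and have the monitor check that field instead of searching raw text. This is only a sketch: the field names `kind` and `message` and the functions `log_record` and `is_oom_error` are assumptions, not something from the thread.

```python
import json


def log_record(kind: str, message: str) -> str:
    """Serialize one log entry as a single JSON line.

    json.dumps escapes newlines and quotes inside strings, so user text
    cannot break out of its field or start a new line.
    """
    return json.dumps({"kind": kind, "message": message})


def is_oom_error(line: str) -> bool:
    """True only for records whose kind is "error" and that mention the exception."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    return (
        record.get("kind") == "error"
        and "OutOfMemoryException" in str(record.get("message", ""))
    )
```

With this shape, a user searching for "Error: OutOfMemoryException" produces a record whose kind is "search", so the check does not fire no matter what the message contains.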

  • Until someone searches for "Error: OutOfMemoryException"

    • If structured logging is too much, unique prefixes solve this issue. Basically you need some token that user-provided data is unable to output to the log. If you rigorously escape all newlines, you can then use start-of-line and end-of-line as unforgeable tokens. The possibilities are endless and it all comes down to understanding whether the intersection of two grammars is empty.

    • I read the gp to mean that error.log (being parsed to look for OOM) would have no associations with userSearches.log, in which an end-user searched for OOM
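
The start-of-line and end-of-line idea from the unique-prefix reply can be sketched like this in Python. The "Search: " prefix and the names `escape_line_breaks`, `search_line`, and `count_oom` are hypothetical, and the sketch assumes every user-supplied string goes through the escaping step before it is written.

```python
import re

# The monitor only accepts a line that is exactly this text, start to end.
OOM_LINE = re.compile(r"Error: OutOfMemoryException")


def escape_line_breaks(value: str) -> str:
    """Escape backslashes, CR, and LF so user text cannot begin a new line."""
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def search_line(query: str) -> str:
    """Log a user search; the fixed prefix means it never starts with 'Error:'."""
    return "Search: " + escape_line_breaks(query)


def count_oom(log_text: str) -> int:
    """Count lines that match the error line in full, not as a substring."""
    return sum(1 for line in log_text.split("\n") if OOM_LINE.fullmatch(line))
```

Because the match is anchored to the whole line and user text can never start one, searching for "Error: OutOfMemoryException" only ever yields a line beginning with "Search: ", which the monitor does not count.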