The point is not to sanitize known strings like "OutOfMemoryException". The point is to sanitize or (preferably) escape any untrusted data that gets logged, so that it won't be confused for something else.
i think GP's point is how would you even sanitize the string "OutOfMemoryException" which presumably comes from a trusted system
i guess demanding "Structured logs for everything or bust" is the answer? (i'm not a big o11y guy so pardon me if this is obvious)
"o11y" stands for "observability".
Numeronyms are evil and we should stop using them.
Low-tech example: escape all newlines in user-supplied strings, then add a known prefix to all user-supplied data (let's say a double hashtag ##, but anything else works too). When you want to search logs for strings coming from your system, remove/ignore everything after the marker.
It all comes down to understanding whether the intersection of two grammars is empty.
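A rough sketch of that marker idea in Python, in case it helps. The names `MARKER`, `escape_untrusted`, `format_line` and `trusted_part` are made up for illustration, and it assumes the log consumer splits lines on `\n` only and that trusted messages never contain the marker themselves:

```python
MARKER = "##"  # hypothetical marker; trusted messages must never contain it


def escape_untrusted(value: str) -> str:
    """Escape backslashes first (so the escaping is reversible), then line breaks."""
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def format_line(trusted_message: str, untrusted: str) -> str:
    """Build one log line: trusted text, then the marker, then escaped user data."""
    if MARKER in trusted_message:
        raise ValueError("trusted message must not contain the marker")
    return f"{trusted_message} {MARKER}{escape_untrusted(untrusted)}"


def trusted_part(line: str) -> str:
    """Return only the text before the first marker, i.e. what the system wrote."""
    return line.split(MARKER, 1)[0]


if __name__ == "__main__":
    line = format_line("search query:", "Error: OutOfMemoryException\nError: OutOfMemoryException")
    print(line)
    # A monitor that only inspects trusted_part(line) never sees the user's text.
    print("OutOfMemoryException" in trusted_part(line))
```

Because the first marker on a line is always the one written by the system, a user who types `##` themselves gains nothing: their copy lands after the real marker and is ignored along with the rest of their input.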
Absolutely incredible how dense HN can be and that no one has explained. Obviously that isn’t what they are saying; they are saying it’s profoundly stupid to have the server be controlled by a simple string search at all.
An OutOfMemoryException log should not be the same as a search log
And
Should not be related in any way
Until someone searches for "Error: OutOfMemoryException"
If structured logging is too much, unique prefixes solve this issue. Basically you need some token that user provided data is unable to output to the log. If you rigorously escape all newlines, you can then use start-of-line and end-of-line as unforgeable tokens. The possibilities are endless and it all comes down to understanding whether the intersection of two grammars is empty.
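To make the start-of-line trick concrete, here is a small Python sketch. It is only an illustration under assumptions: the log format, the `Error: OutOfMemoryException` prefix, and the helper names `escape_line_breaks` and `count_oom` are all hypothetical, and lines are assumed to be separated by `\n` only:

```python
import re

# Only matches when the error text sits at the very start of a log line.
OOM_AT_LINE_START = re.compile(r"^Error: OutOfMemoryException\b")


def escape_line_breaks(value: str) -> str:
    """Make it impossible for user data to begin a new log line."""
    return (
        value.replace("\\", "\\\\")
        .replace("\r", "\\r")
        .replace("\n", "\\n")
    )


def count_oom(log_text: str) -> int:
    """Count lines that start with the OOM error, ignoring mentions mid-line."""
    return sum(1 for line in log_text.split("\n") if OOM_AT_LINE_START.match(line))


if __name__ == "__main__":
    user_query = "x\nError: OutOfMemoryException"
    safe_log = "\n".join([
        "Error: OutOfMemoryException in worker",
        "search: " + escape_line_breaks(user_query),
    ])
    unsafe_log = "\n".join([
        "Error: OutOfMemoryException in worker",
        "search: " + user_query,  # no escaping: the user forges a line start
    ])
    print(count_oom(safe_log), count_oom(unsafe_log))
```

The point of the comparison is that the anchored pattern is only as trustworthy as the escaping: once user data can emit a raw newline, start-of-line stops being a token the user cannot produce.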
I read the GP to mean that error.log (being parsed to look for OOM) would have no association with userSearches.log, in which an end-user searched for OOM.