Comment by rollcat

10 months ago

> bytestreams are less debuggable

Text streams are considered "better" because the standard UNIX userland (unlike e.g. DOS) provided excellent tools for dealing with text streams: grep, sed, awk, find, tr, etc., and of course the shell itself.

But once you get your hands on excellent tools (like jq) for dealing with other kinds of data (like JSON), it turns out everything is even more powerful - you can now work with JSON as easily as with text, it just plugs into the existing ecosystem. But even though JSON has a human-readable text representation, it is no longer just text - it is dynamically-structured, but strongly-typed data. A JSON array is a JSON array, you can't just awk it.
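To make the "you can't just awk it" point concrete, here is a minimal Python sketch. The sample document is made up for illustration, and splitting on commas only stands in for an awk-style field split; it is not what awk literally does.

```python
import json

# Hypothetical sample data; note the comma inside the second name.
doc = '[{"name": "Ada Lovelace", "age": 36}, {"name": "Hopper, Grace", "age": 85}]'

# Text-tool approach: treat the document as delimiter-separated fields.
# The comma inside "Hopper, Grace" is indistinguishable from a separator.
naive_fields = doc.split(",")

# Structure-aware approach: parse first, then address values by key.
records = json.loads(doc)
names = [record["name"] for record in records]
ages = [record["age"] for record in records]

print(len(naive_fields), "chunks from the naive split")
print(names)
print(ages)  # real integers, not digit strings
```

The parsed version knows that the ages are numbers and that the second name is a single string, which is the typed structure a plain text split throws away.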

There are byte stream formats (e.g. msgpack) that have feature parity with JSON. jq can't accept msgpack-encoded byte streams, but suppose a hypothetical tool, msgpack2json, is widely available - just plug it into jq. You're still working on the same level of abstraction (shell pipes), but easily dealing with complex byte streams.
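As a rough illustration of what such a converter could do, here is a Python sketch that decodes a small subset of MessagePack (fixint, fixstr, fixarray, fixmap, nil, booleans, uint8) and re-emits it as JSON text. The function name `unpack` and the sample bytes are assumptions made for this example; a real tool would use a complete MessagePack library rather than this hand-rolled subset.

```python
import json

def unpack(buf, pos=0):
    """Decode one value from a SUBSET of MessagePack; return (value, next_pos)."""
    tag = buf[pos]
    pos += 1
    if tag <= 0x7F:                       # positive fixint
        return tag, pos
    if 0x80 <= tag <= 0x8F:               # fixmap: low 4 bits = entry count
        out = {}
        for _ in range(tag & 0x0F):
            key, pos = unpack(buf, pos)
            value, pos = unpack(buf, pos)
            out[key] = value
        return out, pos
    if 0x90 <= tag <= 0x9F:               # fixarray: low 4 bits = length
        items = []
        for _ in range(tag & 0x0F):
            item, pos = unpack(buf, pos)
            items.append(item)
        return items, pos
    if 0xA0 <= tag <= 0xBF:               # fixstr: low 5 bits = byte length
        n = tag & 0x1F
        return buf[pos:pos + n].decode("utf-8"), pos + n
    if tag == 0xC0:                       # nil
        return None, pos
    if tag == 0xC2:                       # false
        return False, pos
    if tag == 0xC3:                       # true
        return True, pos
    if tag == 0xCC:                       # uint8
        return buf[pos], pos + 1
    if tag >= 0xE0:                       # negative fixint
        return tag - 0x100, pos
    raise ValueError(f"type byte 0x{tag:02x} not handled by this sketch")

# Hand-encoded sample: {"tool": "jq", "ok": true, "n": [1, 2, 200]}
packed = b"\x83\xa4tool\xa2jq\xa2ok\xc3\xa1n\x93\x01\x02\xcc\xc8"

value, end = unpack(packed)
as_json = json.dumps(value)
print(as_json)
```

Once the bytes have been turned into JSON text like this, the rest of the pipeline (jq and friends) does not need to know the data ever had a binary encoding.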

And of course, what we understand as "text" in the modern era is a UTF-8-encoded byte stream. If your "text" kit deals with ASCII rather than Unicode runes, it's that much less powerful, and likely full of painful edge cases that you now have to debug. (Note that UTF-8 is a 1992 thing: it was invented when UNIX was 20-something years old, and it's been around for 30+ years.)
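A short Python sketch of the kind of edge case this means. The sample string is arbitrary, and Python's `bytes` methods stand in here for any byte-oriented, ASCII-only tool.

```python
# Escapes are used so the string is unambiguously precomposed code points:
# "naïve café 日本語"
text = "na\u00efve caf\u00e9 \u65e5\u672c\u8a9e"
data = text.encode("utf-8")

# Length: code points and bytes disagree as soon as non-ASCII shows up.
rune_count = len(text)
byte_count = len(data)

# Truncation: cutting at a byte offset can split a multi-byte sequence.
try:
    data[:3].decode("utf-8")
    truncation_ok = True
except UnicodeDecodeError:
    truncation_ok = False

# Case mapping: bytes.upper() only touches ASCII letters.
ascii_upper = data.upper().decode("utf-8")
unicode_upper = text.upper()

print(rune_count, byte_count, truncation_ok)
print(ascii_upper)
print(unicode_upper)
```

The byte-oriented operations silently miscount, corrupt, or skip the non-ASCII characters, while the same operations on decoded text behave as a reader would expect.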

Debuggability of anything is entirely up to your toolkit; the quality and comprehensiveness of that toolkit is what decides the battle.
