Comment by rennokki

13 hours ago

> Uses jq for TB json files

> Hadoop: bro

> Spark: bro

> hive: bro

> data team: bro

JQ is very convenient, even if your files are more than 100GB. I often need to extract one field from huge JSON line files, I just pipe jq to it to get results. It's slower, but implementing proper data processing will take more time.