Comment by rscho

1 year ago

For using a heap of libs to compute something that has been done a thousand times? No. For writing a completely new data processing pipeline from scratch? Much, much faster! Array langs have decent performance, but that's not why they are used. They are used because you develop faster for the kind of problem they're good at. People always conflate those two aspects. I use J for scientific research.

Seems weird to switch to "develop faster" and then complain about people conflating the two aspects when this thread is clearly talking about runtime performance, triggered by the benchmark claims:

> real-sql(k) is consistently 100 times faster (or more) than redshift, bigquery, snowflake, spark, mongodb, postgres, ..

> same data. same queries. same hardware. anyone can run the scripts.

  • GP here :)

    When talking about speed I was rolling together the time taken to write the query and get it running with the run time itself. The total speed depends on both.

    In another thread someone pointed out that being a good quant is also about having the best ideas in the first place.

    This is just a general comment on the whole comments section that I leave here:

    We have this weird situation where the general programming population looks at a small group of the most profitable programmers who are thinking about their domain problems in languages that are a mapping of mathematical symbols to a qwerty keyboard (in the old days there were APL keyboards). And the mainstream programmers say that is so weird that it must be wrong, must be a lie, and so on.

    Occam’s razor says those profitable programmers wouldn't keep buying K if they could get the same results at a lower TCO, or better results at a higher price, elsewhere.

    In broader data engineering there has been tech like "I use Scala!" that is used to gatekeep and recognise the in-crowd. But that is at the faceless corporate end of enterprise data engineering, where people are not measured by bottom lines.

    Sorry for venting :)

    • > most profitable programmers

      This more than anything demonstrates the hothouse-flower mentality of K stans. Quants have long since stopped being the best-paid or most value-generating engineers, and since K has zero application outside of quant, it's no longer even a particularly lucrative skill to acquire.

      It's interesting though that the opacity of the "I make more money than you" argument fits so snugly with other unverifiable and outdated claims of K supremacy, like performance, job security, or expressiveness.

  • > Seems weird to switch to "develop faster" and then complain about people conflating the two aspects when this thread is clearly talking about runtime performance, triggered by the benchmark claims

    It doesn't look to me like GP switched to "develop faster" and then complained of conflation. The switch was made higher up the thread by wood_spirit, and GP just continued the conversation (and called out the tendency to conflate, without calling out a specific person).

    On a meta note, I wish this trend of saying "it seems weird" and then calling out some fallacy or error would die. Fallacies are extremely common and not "weird", and it comes off as extremely condescending.

    It happens quite frequently on HN (and surely other places, though I don't regularly patronize those). So to be clear, this isn't criticism levelled at you exclusively. (I even include myself as a target for this criticism, as I've used the expression on HN previously, before I thought more about it.)

    Firstly, in this case and in most cases where that expression is used, it's actually weird to call it weird[1]. Fallacies, logic errors, and other mistakes are extremely natural to humans. Even with significant training and effort, we still make those mistakes routinely.

    Secondly, it seems like it's often used as a veiled ad hominem or insult. It's entirely superfluous to add. In this case you could have just said "you complained about people conflating the two aspects and then conflated them yourself." (It still wouldn't have been correct as GP didn't conflate them, but it would have been more direct and clear).

    Thirdly, it comes off as condescending[2]. It's sort of like saying, "whoa dude, we're all normal and don't make mistakes, but you're weird and do make mistakes." In reality, we all do it, so it's not weird at all.

    [1]: https://www.merriam-webster.com/dictionary/weird

      1: of strange or extraordinary character : ODD, FANTASTIC
      2: of, relating to, or caused by witchcraft or the supernatural : MAGICAL
    

    [2]: The irony of this is not lost on me. I can definitely see how this comment might also come off as condescending. I don't intend it to be, but it is a ridiculously long comment for such a simple point. It also includes a dictionary check, which is frequently a characteristic of condescending comments. I don't intend it to be condescending, merely reflective and self-analytical, but as is my theme here, we all make mistakes :-)

    • You can understand something fully and still call it weird. I’ve used perl for decades, some of the things it does are still just weird. As for fallacies, one of the first things I was taught in logic class was that using an argument’s formal structure (or lack thereof) to determine truth is itself a fallacy. Unsound != Untrue, and throwing around Latin names of fallacies doesn’t actually support an argument.

  • I'm not switching anything. Just trying to add to the conversation. BTW, for simpler queries I have no doubt the benchmarks are correct. I anticipate it would not hold for more beefy queries.

    • You came by it honestly! The initial conflation (and therefore context switch) happened further up-thread.

Yeah, to be clear I meant performance-wise. In terms of development speed it looks like it just goes to an insane extreme on the "fast at the beginning / fast in the middle" trade-off. You know how dynamic typing can be faster to develop with when you're writing a one-man 100-line script, but once you get beyond that, the extra effort of adding static types pays off and you overtake dynamic typing.

  • Performance-wise I'll admit I don't really know. But I do know that performance is at least decent, in that it lets you experiment easily with pretty much any array problem. Your dynamic/static typing analogy is pretty much on point.