Comment by naasking

2 months ago

> What I realized is that lower costs, and therefore lower quality,

This implication is the big question mark. It's often true but it's not at all clear that it's necessarily true. Choosing better languages, frameworks, tools and so on can all help with lowering costs without necessarily lowering quality. I don't think we're anywhere near the bottom of the cost barrel either.

I think the problem is focusing on improving the quality of the end products directly when the quality of the end product for a given cost is downstream of the quality of our tools. We need much better tools.

For instance, why are our languages still obsessed with manipulating pointers and references as a primary mode of operation, just so we can program yet another linked list? Why can't you declare something as a "Set with O(1) insert" and the language or its runtime chooses an implementation? Why isn't direct relational programming more common? I'm not talking programming in verbose SQL, but something more modern with type inference and proper composition, more like LINQ, eg. why can't I do:

    let usEmployees = from x in Employees where x.Country == "US";

    func byFemale(Query<Employees> q) =>
      from x in q where x.Sex == "Female";

    let femaleUsEmployees = byFemale(usEmployees);

These abstract over implementation details that we're constantly fiddling with in our end programs, often for little real benefit. Studies have repeatedly shown that humans can write less than 20 lines of correct code per day, so each of those lines should be as expressive and powerful as possible to drive down costs without sacrificing quality.

26 comments

naasking

ndriscoll 2 months ago

You can do this in Scala[0], and you'll get type inference and compile time type checking, informational messages (like the compiler prints an INFO message showing the SQL query that it generates), and optional schema checking against a database for the queries your app will run. e.g.

    case class Person(name: String, age: Int)
    inline def onlyJoes(p: Person) = p.name == "Joe"

    // run a SQL query
    run( query[Person].filter(p => onlyJoes(p)) )
    
    // Use the same function with a Scala list
    val people: List[Person] = ...
    val joes = people.filter(p => onlyJoes(p))

    // Or, after defining some typeclasses/extension methods
    val joesFromDb = query[Person].onlyJoes.run
    val joesFromList = people.onlyJoes

This integrates with a high-performance functional programming framework/library that has a bunch of other stuff like concurrent data structures, streams, an async runtime, and a webserver[1][2]. The tools already exist. People just need to use them.

[0] https://github.com/zio/zio-protoquill?tab=readme-ov-file#sha...

[1] https://github.com/zio

[2] https://github.com/zio/zio-http

naasking 2 months ago
Notice how you're still specifying List types? That's not what I'm describing.
You're also just describing a SQL mapping tool, which is also not really it either, though maybe that would be part of the runtime invisible to the user. Define a temporary table whose shape is inferred from another query, that's durable and garbage collected when it's no longer in use, and make it look like you're writing code against any other collection type, and declaratively specify the time complexity of insert, delete and lookup operations, then you're close to what I'm after.
- ndriscoll 2 months ago
  
  The explicit annotation on people is there for illustration. In real code it can be inferred from whatever the expression is (as the other lines are).
  I don't think it's reasonable to specify the time complexity of insert/delete/lookup. For one, joins quickly make you care about multi-column indices and the precise order things are in and the exact queries you want to perform. e.g. if you join A with B, are your results sorted such that you can do a streaming join with C in the same order? This could be different for different code paths. Simply adding indices also adds maintenance overhead to each operation, which doesn't affect (what people usually mean by) the time complexity (it scales with number of indices, not dataset size), but is nonetheless important for real-world performance. Adding and dropping indexes on the fly can also be quite expensive if your dataset size is large enough to care about performance.
  That all said, you could probably get at what you mean by just specifying indices instead of complexity and treating an embedded sqlite table as a native mutable collection type with methods to create/drop indices and join with other tables. You could create the table in the constructor (maybe using Object.hash() for the name or otherwise anonymously naming it?) and drop it in the finalizer. Seems pretty doable in a clean way in Scala. In some sense, the query builders are almost doing this, but they tend to make you call `run` to go from statement to result instead of implicitly always using sqlite.
  
  1 reply →

mike_hearn 2 months ago

Hm, you could do that quite easily but there isn't much juice to be squeezed from runtime selected data structures. Set with O(1) insert:

    var set = new HashSet<Employee>();

Done. Don't need any fancy support for that. Or if you want to load from a database, using the repository pattern and Kotlin this time instead of Java:

    @JdbcRepository(dialect = ANSI) interface EmployeeQueries : CrudRepository<Employee, String> {
        fun findByCountryAndGender(country: String, gender: String): List<Employee>
    }

    val femaleUSEmployees = employees.findByCountryAndGender("US", "Female")

That would turn into an efficient SQL query that does a WHERE ... AND ... clause. But you can also compose queries in a type safe way client side using something like jOOQ or Criteria API.

naasking 2 months ago
> Hm, you could do that quite easily but there isn't much juice to be squeezed from runtime selected data structures. Set with O(1) insert:
But now you've hard-coded this selection, why can't the performance characteristics also be easily parameterized and combined, eg. insert is O(1), delete is O(log(n)), or by defining indexes in SQL which can be changed at any time at runtime? Or maybe the performance characteristics can be inferred from the types of queries run on a collection elsewhere in the code.
> That would turn into an efficient SQL query that does a WHERE ... AND ... clause.
For a database you have to manually construct, with a schema you have to manually and poorly to an object model match, using a library or framework you have to painstakingly select from how many options?
You're still stuck in this mentality that you have to assemble a set of distinct tools to get a viable development environment for most general purpose programming, which is not what I'm talking about. Imagine the relational model built-in to the language, where you could parametrically specify whether collections need certain efficient operations, whether collections need to be durable, or atomically updatable, etc.
There's a whole space of possible languages that have relational or other data models built-in that would eliminate a lot of problems we have with standard programming.
- mike_hearn 2 months ago
  
  There are research papers that examine this question of whether runtime optimizing data structures is a win, and it's mostly not outside of some special cases like strings. Most collections are quite small. Really big collections tend to be either caches (which are often specialized anyway), or inside databases where you do have more flexibility.
  A language fully integrated with the relational model exists, that's PL/SQL and it's got features like classes and packages along with 'natural' SQL integration. You can do all the things you ask for: specify what operations on a collection need to be efficient (indexes), whether they're durable (temporary tables), atomically updatable (LOCK TABLE IN EXCLUSIVE MODE) and so on. It even has a visual GUI builder (APEX). And people do build whole apps in it.
  Obviously, this approach is not universal. There are downsides. One can imagine a next-gen attempt at such a language that combined the strengths of something like Java/.NET with the strengths of PL/SQL.
  
  4 replies →
- jimbokun 2 months ago
  
  Why aren’t you building these languages?

bflesch 2 months ago

Your argument makes sense. I guess now it's your time to shine and to be the change you want to see in the world.

naasking 2 months ago
I wish I had the time... always "some day"...
- jimbokun 2 months ago
  
  Thus the answer to your question of why those languages don’t exist.
  
  1 reply →

gus_massa 2 months ago

Isn't this comprehension in Python https://www.w3schools.com/python/python_lists_comprehension.... ?

dragandj 2 months ago

Clojure, friend. Clojure.

Other functional languages, too, but Clojure. You get exactly this, minus all the <'s =>'s ;'s and other irregularities, and minus all the verbosity...

rjbwork 2 months ago

I consider functional thinking and ability to use list comprehensions/LINQ/lodash/etc. to be fundamental skills in today's software world. The what, not the how!

naasking 2 months ago
Agreed, but it doesn't go far enough IMO. Why not add language/runtime support for durable list comprehensions, and also atomically updatable ones so they can be concurrently shared, etc. Bring the database into the language in a way that's just as easily to use and query as any other value.
- rjbwork 2 months ago
  
  Well, you can do that with LINQ + EF and embedded databases like SQL Lite or similar.
  
  3 replies →

jg0r3 2 months ago

Could you link any of these studies?

I couldn't find anything specific when searching.

naasking 2 months ago

Might be a good place to start, some citations and calculations in the replies there:
https://softwareengineering.stackexchange.com/a/450699
I did read actual studies that were conducted years ago, but don't have access to them at this point.