
Comment by naasking

4 days ago

Notice how you're still specifying List types? That's not what I'm describing.

You're also just describing a SQL mapping tool, which isn't really it either, though maybe that would be part of the runtime, invisible to the user. Define a temporary table whose shape is inferred from another query, that's durable and garbage-collected when it's no longer in use; make it look like you're writing code against any other collection type; and let me declaratively specify the time complexity of insert, delete, and lookup operations. Then you're close to what I'm after.

The explicit annotation on `people` is there for illustration. In real code it can be inferred from whatever the expression is (as the other lines are).

I don't think it's reasonable to specify the time complexity of insert/delete/lookup. For one, joins quickly make you care about multi-column indices, the precise order results come back in, and the exact queries you want to perform. E.g. if you join A with B, are your results sorted such that you can do a streaming join with C in the same order? This could be different for different code paths. Simply adding indices also adds maintenance overhead to each operation, which doesn't affect (what people usually mean by) the time complexity (it scales with the number of indices, not the dataset size), but is nonetheless important for real-world performance. Adding and dropping indices on the fly can also be quite expensive if your dataset is large enough to care about performance.

All that said, you could probably get at what you mean by just specifying indices instead of complexity and treating an embedded sqlite table as a native mutable collection type, with methods to create/drop indices and join with other tables. You could create the table in the constructor (maybe using Object.hash() for the name, or otherwise naming it anonymously?) and drop it in the finalizer. Seems pretty doable in a clean way in Scala. In some sense, the query builders are almost doing this, but they tend to make you call `run` to go from statement to result instead of implicitly always using sqlite.
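
Rough sketch of the shape I mean, assuming the xerial sqlite-jdbc driver is on the classpath (the SqlTable name, the single TEXT column, and the explicit close instead of a finalizer are just illustration choices, not a worked-out design):

    import java.sql.{Connection, DriverManager}

    // A sqlite table dressed up as a mutable collection: the table is created in
    // the constructor under an anonymous name and dropped on close (an explicit
    // close is more predictable on the JVM than a finalizer, but either would do).
    final class SqlTable[A](conn: Connection, encode: A => String) extends AutoCloseable {
      private val name = s"t_${System.identityHashCode(this)}"
      conn.createStatement().execute(s"CREATE TABLE $name (v TEXT)")

      def insert(a: A): Unit = {
        val st = conn.prepareStatement(s"INSERT INTO $name (v) VALUES (?)")
        st.setString(1, encode(a)); st.executeUpdate(); st.close()
      }

      // Indices are requested by name rather than derived from a complexity spec.
      def createIndex(ix: String): Unit =
        conn.createStatement().execute(s"CREATE INDEX IF NOT EXISTS ${name}_$ix ON $name (v)")

      def dropIndex(ix: String): Unit =
        conn.createStatement().execute(s"DROP INDEX IF EXISTS ${name}_$ix")

      override def close(): Unit =
        conn.createStatement().execute(s"DROP TABLE IF EXISTS $name")
    }

    object Demo {
      def main(args: Array[String]): Unit = {
        val conn = DriverManager.getConnection("jdbc:sqlite::memory:")
        val people = new SqlTable[String](conn, identity)
        people.insert("alice"); people.insert("bob")
        people.createIndex("by_v")
        people.close(); conn.close()
      }
    }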

  • > In real code it can be inferred from whatever the expression is (as the other lines are).

    What I meant is that there would be no explicit List<T> types, or array types, or hash tables, or trees, etc. Contiguity of the data is an implementation detail that doesn't matter for the vast majority of programming, much like how fields are packed in an object is almost completely irrelevant. Existing languages drive people to attend to small details, like collection choice, that largely don't matter except in extreme circumstances (like game programming).

    What it would have is something more like a Set<T ordered by T.X>, and maybe even ordering shouldn't be specifiable, as that's typically a detail of presentation/consumers of the data. Restrictions are freeing, so the point is to eliminate many ill-advised premature optimizations and unnecessary internal details. Maybe the runtime will use one of those classic collections internally, based on the constraints you specify on the set, but the fundamental choice would not typically be visible (toy sketch at the end of this comment).

    > That all said, you could probably get at what you mean by just specifying indices instead of complexity and treating an embedded sqlite table as a native mutable collection type with methods to create/drop indices and join with other tables.

    Yes, something like sqlite would likely be part of the runtime of such a language, and that seems like the most straightforward way to prototype it. Anyway, I don't have a concrete semantics worked out so much as rough ideas about certain properties, and this is only one of them.
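
    To make the Set<T ordered by T.X> point a bit more concrete, here's a toy sketch (Relation and the name-keyed lookup are made up for illustration; the in-memory maps are just one backing a runtime might pick, sqlite being another):

        import scala.collection.mutable

        // The caller only declares which projections need to be cheap to look up;
        // whether that becomes a hash map, a tree, or a sqlite index is the
        // runtime's business and never shows up in the type.
        final case class Person(name: String, age: Int)

        final class Relation[T](keys: (String, T => Any)*) {
          private val rows = mutable.ArrayBuffer.empty[T]
          private val indices =
            keys.map { case (k, f) => (k, (f, mutable.Map.empty[Any, List[T]])) }.toMap

          def insert(t: T): Unit = {
            rows += t
            indices.values.foreach { case (f, m) =>
              m.update(f(t), t :: m.getOrElse(f(t), Nil))
            }
          }

          // Lookups on a declared key hit an index; undeclared keys would fall
          // back to a scan over rows (omitted here to keep the sketch short).
          def lookup(key: String, value: Any): List[T] =
            indices.get(key).map { case (_, m) => m.getOrElse(value, Nil) }.getOrElse(Nil)
        }

        object Sketch {
          def main(args: Array[String]): Unit = {
            val byName: Person => Any = _.name
            // Roughly "Set[Person] keyed by name": no concrete collection is visible.
            val people = new Relation[Person]("name" -> byName)
            people.insert(Person("Ada", 36))
            println(people.lookup("name", "Ada")) // List(Person(Ada,36))
          }
        }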