Comment by feverzsj

19 days ago

Yes. And it should be faster. They may have forgotten to create a spatial index.


feverzsj

Agree with this. They are re-solving a problem that has been solved better by others before (with R-trees).

They may well be using some data storage where spatial indexing is not possible or not standard. GeoParquet is a common one now: a great format in many ways, but spatial indexing isn't there.

Postgres may be out of fashion, but an old-fashioned PostGIS server is still sometimes the simplest solution.

pb060 19 days ago
Why do you consider Postgres + PostGIS out of fashion? What are people using for spatial data these days?
- twelvechairs 19 days ago
  
  For use cases like this, long-term geospatial people still use PostGIS as a foundation, mainly for its speed at scale and spatial indexing.
  For the wider tech world, I would say Postgres suffers from being "old tech" and somewhat "monolithic". There have been a lot of trends against it (e.g. NoSQL, fleeing the monolith, data lakes). But also, more practically, for a lot of businesses geospatial is not their primary focus. They bring other tech stacks, so something like PostGIS can seem like duplication if they already use another database, data storage format or data processing pipeline. The proliferation of other software and file formats has also made some use cases easier without PostGIS.
  Really, I'd say the most common path I've seen for people without an explicit geospatial background who are starting to implement it is to avoid PostGIS until it becomes absolutely clear that they need it.
  
rockinghigh 19 days ago
I wouldn't say R-trees solve the problem better. Joining multiple spatial datasets indexed with R-trees is more complex, as the nodes are dynamic and data dependent. Neighborhood search is also more complicated because parent nodes overlap.
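To make the neighborhood-search point concrete, here is a rough sketch of why a fixed grid keeps that lookup simple: the candidate cells around a query point follow from arithmetic alone, with no tree to descend. It assumes a plain square grid as a stand-in for H3's hexagonal cells, and every name in it (`CELL`, `cell_of`, `build_grid`, `neighbours_within`) is made up for illustration, not taken from any library.

```python
from collections import defaultdict
from math import floor, hypot

CELL = 1.0  # hypothetical cell size, in the same units as the coordinates


def cell_of(x, y, size=CELL):
    """Map a point to the integer (column, row) of its grid cell."""
    return (floor(x / size), floor(y / size))


def build_grid(points, size=CELL):
    """Bucket points by cell; the whole 'index' is a dict of lists."""
    grid = defaultdict(list)
    for x, y in points:
        grid[cell_of(x, y, size)].append((x, y))
    return grid


def neighbours_within(grid, x, y, radius, size=CELL):
    """Return points within `radius` of (x, y), scanning only nearby cells."""
    reach = int(radius // size) + 1  # how many cells out we need to look
    cx, cy = cell_of(x, y, size)
    hits = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            # .get avoids creating empty buckets in the defaultdict
            for px, py in grid.get((cx + dx, cy + dy), ()):
                if hypot(px - x, py - y) <= radius:
                    hits.append((px, py))
    return sorted(hits)


points = [(0.2, 0.3), (0.9, 0.4), (1.1, 0.5), (5.0, 5.0)]
grid = build_grid(points)
print(neighbours_within(grid, 1.0, 0.5, 0.5))
```

The same property is what makes grid-to-grid joins straightforward: two datasets bucketed with the same cell size can be matched on the cell key. The trade-off is that a fixed cell size does not adapt to how the data is distributed.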
- twelvechairs 19 days ago
  
  It's a well-researched area. My understanding is that for most use cases and data like this, R-trees outperform, as bounding box comparisons are fast to run and the bounding boxes tend to be well organised to chunk data efficiently. H3 cells fit the data more loosely, and you may find lots of your points are clustered in a few cells, so you end up doing more expensive detailed intersection calculations. Of course it all depends a little on your data, use case and to some extent the parameters chosen for the spatial index. But I think it's safe to say now, based on industry experience, that R-trees do a very good job 99.9% of the time.
  You can of course also use H3 in PostGIS directly, alongside R-trees. It helps significantly for heatmap creation and sometimes for neighbourhood searches.
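For readers who haven't seen it, the "bounding box comparisons are fast" argument boils down to a filter-and-refine pattern, sketched below under some loud assumptions: it is pure Python, it handles a single polygon, and the helper names (`bbox`, `in_bbox`, `point_in_polygon`, `filter_and_refine`) are hypothetical. A real R-tree nests such boxes hierarchically so that most boxes are never compared at all; this only shows the cheap rejection step that runs before the expensive exact test.

```python
def bbox(polygon):
    """Axis-aligned bounding box of a polygon given as (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def in_bbox(box, x, y):
    """Cheap test: four comparisons, regardless of polygon complexity."""
    minx, miny, maxx, maxy = box
    return minx <= x <= maxx and miny <= y <= maxy


def point_in_polygon(polygon, x, y):
    """Exact test via ray casting; cost grows with the vertex count."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_and_refine(polygon, points):
    """Return (points inside the polygon, number of exact tests run)."""
    box = bbox(polygon)
    exact_tests = 0
    hits = []
    for x, y in points:
        if not in_bbox(box, x, y):
            continue  # rejected by the bounding box, no exact test needed
        exact_tests += 1
        if point_in_polygon(polygon, x, y):
            hits.append((x, y))
    return hits, exact_tests


triangle = [(0, 0), (4, 0), (0, 4)]
candidates = [(1, 1), (3, 3), (10, 10), (-1, 2)]
print(filter_and_refine(triangle, candidates))
```

The clustering concern in the comment above shows up here as `exact_tests`: when a coarse cell or box lets most candidates through, almost every point still pays for the exact test.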