Show HN: Hydra - Open-Source Columnar Postgres

3 years ago (hydra.so)

hi hn, hydra ceo here

hydra is an open-source extension that adds columnar tables to Postgres for efficient analytical reporting. With Hydra, you can analyze billions of rows instantly without changing code.

demo video (5 min): https://youtu.be/1yzxgb0Oyrw github repo: https://github.com/hydradatabase/hydra

For 1.0 GA release, aggregate queries are over *60% faster* than Hydra beta due to aggregate vectorization. Spatial indexes (gin, gist, spgist, and rum indexes) and pg_hint_plan are now enabled for performance optimization.

postgres is great, but aggregates can take minutes to hours to return results on large data sets. long-running analytical queries hog database resources and degrade performance. use hydra to run much faster analytics on postgres without changing code.

for testing, try the hydra free tier to create a columnar postgres instance in the cloud. https://dashboard.hydra.so/signup

33 comments

coatue

mlenhard 3 years ago

Congrats on the 1.0 Release, big milestone.

I'm personally really excited about all of the recent tooling for postgres aggregates. Definitely a pain point for a lot of developers, and it's easy to fall into the trap where things work fine in the beginning and then query times explode as requirements change and the dataset grows. Nice to not have to spin up another DB in order to solve the problem as well.

MuffinFlavored 3 years ago

> I'm personally really excited about all of the recent tooling for postgres aggregates. Definitely a pain point for a lot of developers
Could you give a few examples of what you are speaking of?

cjonas 3 years ago

What's the workflow for leveraging this extension in real-time for an existing database?

Say I wanted to use this to create a high performance "aggregation" API of my existing "write heavy" tables.

Is there a way to keep a `heap` & `columnar` table in sync?

(relative Postgres noob here)

gregwebs 3 years ago

You could use ETL tools like peerdb.io. This isn't "real time" but instead some refresh interval. ZomboDB uses ElasticSearch as an index that is transactionally consistent with Postgres. It gives hope that in the future we will see consistent columnar tables or indexes. SQL Server supports columnar indexes on non-columnar tables.
jerrysievert 3 years ago

there are a couple of ways to do it, and none of them that I'm able to think of are great - maybe some others will be able to answer better than I am, but ...
if the data is append-only, an insert trigger could work. if it gets updated and deleted, then insert, update, and delete triggers could be added. of course if the table is very active, this could get bad, fast.
alternately, you can do an insert every hour or so, like insert into table_columnar select * from table_heap where created_at >= DATE_TRUNC('hour', now()) - interval '1 hour' and created_at < DATE_TRUNC('hour', now())
or, even truncate the columnar table daily and re-insert all of the data.
likely none of these is the _best_ solution, but they could help you find what might be the best solution for you.
alternately, if the query patterns work well, you can simply convert the table to columnar, but that's not a panacea.
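
A minimal sketch of the hourly batch approach mentioned above. It assumes an append-only heap table (called table_heap here, a hypothetical name) with a created_at timestamp column, and that the columnar extension is installed so that USING columnar is available. The half-open time window is an illustrative choice to avoid copying a row twice; it is not something Hydra prescribes.

```sql
-- One-time setup: a columnar table with the same columns as the heap table.
CREATE TABLE table_columnar (LIKE table_heap) USING columnar;

-- Run shortly after the top of each hour (cron, pg_cron, or any scheduler).
-- Copies exactly the previous full hour:
-- [start of last hour, start of current hour).
INSERT INTO table_columnar
SELECT *
FROM table_heap
WHERE created_at >= date_trunc('hour', now()) - interval '1 hour'
  AND created_at <  date_trunc('hour', now());
```

This only fits append-only data: rows that are updated or deleted in the heap table after being copied are not propagated, and a row committed late with an older created_at would be missed.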

garysahota93 3 years ago

I've been using Hydra for the last ~2 months & genuinely love it. The team is really talented & it's so great to see the progress they've been making. Congrats on the 1.0 GA release! Huge step!

coatue 3 years ago

Thank you!

iepathos 3 years ago

Nice tool, only an unfortunate name; consider changing it. There's already a very well known security tool named hydra https://github.com/vanhauser-thc/thc-hydra that's been around since 2001. Then facebook went ahead and named their config tool hydra https://github.com/facebookresearch/hydra on top of it. Like we get it, hydra is popular mythology, but we could use more original naming for tools

cultofmetatron 3 years ago
yea acropolis would be a better name given that its columns are famous
- coatue 3 years ago
  
  I was thinking X.com - is it available?
  
robertlagrant 3 years ago

Let's hope Ory never uses it! Oh no[0].
[0] https://www.ory.sh/hydra
metadat 3 years ago

Yeah, everyone names their thing something generic like Atlas or Hydra. Choose to be daring and original instead! You won't regret it.
hamoid 3 years ago

Or https://github.com/hydra-synth/hydra (Livecoding networked visuals in the browser, since 2017)

efxhoy 3 years ago

Big congrats on 1.0! Super exciting project.

My dream scenario would be installing hydra as an extension into my main rails application database. My use case is showing analytics numbers directly to users, like "how many people visited my listing", which regular row-level postgres is not suited to answer. To do this now we need to get that data from our DW, which is slow for single queries, so we need a cache, which we need to keep in sync, which is complexity I don't want. It would be amazing if I could do user-facing analytics queries directly in my main app db.

What put me off after a quick scroll:

Installing the extension changes the default table type to be columnar. I don't want an installed extension to do that, my main workload is still row oriented oltp, I only want specific tables to be columnar and I don't want to change all my normal migrations to specify `USING heap`. IMO timescale does this really well, it's an extension, not a new database. At least that's how I would want it to be.

It also seems like you're trying to claim postgres foreign data wrappers as "hydra external tables", implying it's a new feature? Postgres does this (reading other databases and external files) out of the box and it feels sneaky to try and brand that.

Also the FAQ says "Hydra is not a fork." when the engine clearly is: https://github.com/hydradatabase/citus I realize you want to monetize this as a bigger platform and that's completely fair, but it strikes me as dishonest to deny the citus origins in the FAQ.

wuputah 3 years ago
Thanks for calling these out, as these are just misunderstandings. We will certainly tweak the language around these.
- Installing the extension itself does not change the default table type, this is only the case on Hydra Cloud and our Docker image.
- "Hydra is not a fork" refers to the fact that Hydra did not fork Postgres; it is an extension. We have put in a lot of effort since forking Citus, but it's not our intent to hide that fact.
- Yes, "Hydra External Tables" is a productization around FDWs, there's more we want to do with it but it hasn't been our focus lately.
- efxhoy 3 years ago
  
  > - Installing the extension itself does not change the default table type, this is only the case on Hydra Cloud and our Docker image.
  Ah cool, thanks! How would I go about adding the extension to my own "FROM postgres:15" Dockerfile?
jerrysievert 3 years ago

> Installing the extension changes the default table type to be columnar.
that is not the case, hydra as a service sets the default table type. the columnar extension does not make any changes like that, it simply ("simply") adds columnar as an option.
I'm just an engineer, so I'll leave the other comments for others :)
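
To illustrate the point about columnar being opt-in, here is a sketch in plain SQL. It assumes the extension is installed under the name columnar (worth verifying for your install) and uses hypothetical table names. default_table_access_method is the standard Postgres setting that decides what a CREATE TABLE without a USING clause produces.

```sql
-- Assumption: the extension is registered as "columnar".
CREATE EXTENSION IF NOT EXISTS columnar;

-- Stock Postgres reports 'heap' here. Per this thread, Hydra Cloud and
-- Hydra's Docker image are the setups that change this default.
SHOW default_table_access_method;

-- Regular OLTP table: no USING clause, so it follows the default.
CREATE TABLE listings (
    id    bigint PRIMARY KEY,
    title text
);

-- Only the analytics table opts in to columnar storage.
CREATE TABLE listing_views (
    listing_id bigint,
    viewed_at  timestamptz
) USING columnar;

-- If the default has been changed and you want row tables without writing
-- USING heap in every migration, reset it for the session.
SET default_table_access_method = 'heap';
```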

pella 3 years ago

Congratulations!

Please also add this info :

#1. to the pgsql-announce list: https://www.postgresql.org/search/?m=1&ln=pgsql-announce&q=h... "Your search for hydra returned no hits."

#2. to the https://planet.postgresql.org/

adultSwim 3 years ago

Watch out. There used to be another Hydra project, a data repository with rich linked metadata, that changed its name after a legal threat over the trademark from Hydra Corporation. Now it's called Hyku, https://hyku.samvera.org/

I hope you choose to defend your name.

coatue 3 years ago

Should have mentioned: if you want to chat about open source, analytics, or meet some of the Hydra team, swing by our event in SF this Thursday: https://partiful.com/e/gowvDVdnNcBLKUzfGOPv

mdaniel 3 years ago

some previous discussions:

https://github.com/hydradatabase/hydra#license since the GitHub sidebar is misleading

nitinreddy88 3 years ago

Congratulations. @coatue, it would be great if you could share your email so I can reach out for licensing details. I did fill out the form on your site, but never received a response

coatue 3 years ago

Nitin, I would be happy to. Would you mind emailing me at J at hydra dot so? Let's chat!

I_am_tiberius 3 years ago

I have 2 questions.

1. Is this optimized for constantly adding and removing rows to the columnar table?

2. Is this supported by Microsoft Azure Flexible Server for Postgres?

gregwebs 3 years ago
> Whenever possible, design data coming into the data warehouse as append-only. Hydra's columnar store only supports inserts. If you need to update or delete data, you will need to use row (heap) tables.
- jerrysievert 3 years ago
  
  that is not the case at this point. updates, deletes, and vacuuming are all available.
  
jerrysievert 3 years ago

I can answer number 1:
updates and deletes are available, as well as the ability to compact the table.

giovannibonetti 3 years ago

> For 1.0 GA release

You may want to check that box in the README, assuming it is already done.

coatue 3 years ago

Great catch - updating, please hold

winrid 3 years ago

How does sharding work? Can I use this with citus to scale horizontally?