
Comment by vjerancrnjak

24 days ago

These things cost money, and Redshift handling live ingestion from Kinesis is tricky.

There is no need for Athena; Redshift ingestion is a simple query that reads from S3. I don’t want to copy 10TB of data just to have it in 1 file. And yes, default storage is a bit better than S3, but for an OLAP database there seems to be no proper column compression, and the data footprint is too big, resulting in slow reads if one is not careful.

I mentioned ClickHouse; the data is obviously not OLTP schemed.

I don’t have normalized data. As I mentioned, the ClickHouse consumer goes through 10TB of blobs and ends up with 15GB of postprocessed data in like 5-10 minutes; the slowest part is downloading from S3.

I am not willing to pay 10k+ a month for something that absolutely sucks compared to a proper OLAP db.

Redshift is just made for some very specific, bloated, throw-as-many-software-pipelines-as-you-can, pay-as-much-money-as-you-can workflows that I just don’t find valuable. Its compute engine and data representation are just laughably slow. Yeah, it can be as fast as you want if you throw parallel units at it, but it’s a complete waste of money.

It seems like you want a time series database, not an OLAP one. Every problem you described you would also have with Snowflake or another OLAP database.

  • Thanks for having this discussion with me. I believe I don't want a time series database. I want to be able to invent new queries and throw them at a schema, or create materialized views to have better queries etc. I just don't find Snowflake or Redshift anywhere close to what they're selling.

    I think these systems are optimized for something else, probably organizational scale, predictable low value workloads, large teams that just throw their shit at it and it works on a daily basis, and of course, it costs a lot.

    My experience after renting a $1k EC2 instance and slurping all of S3 onto it in a few hours, and Redshift being unable to do the same, made me not consider these systems reliable for anything other than ritualistic performative low value work.

    • I’ve told you my background. I’m telling you that you are using the wrong tool for the job. It’s not an issue with the database. Even if you did need an OLAP database like Redshift, you are still treating it like an OLTP database as far as your ETL job goes. You really need to do some additional research.
