Comment by lateforwork
5 days ago
There are two broad types of databases: operational and analytical.
Operational databases store transactions and support day-to-day application workflows.
For analysis, data is often copied into separate analytical databases (data warehouses), which are structured for efficient querying and large-scale data processing. These systems are designed to handle complex, ad-hoc queries and scan-heavy workloads.
LLM agents are the best way to analyze data stored in these databases. This is the future.
> LLM agents are the best way to analyze data stored in these databases
Why, and how?
> Why
Based on my experience with Claude, it's pretty damn good at doing data analysis, if given the right curated data models. You still need to eyeball the generated SQL to make sure it makes sense.
> and how?
1. Replicate your Postgres into Snowflake/Databricks/ClickHouse/etc, or directly to Iceberg and hook it up to Snowflake/Databricks/ClickHouse/etc.
2. Give your agent read access to query it.
3. Build dimensional models (facts and dimensions tables) from the raw data. You can ask LLM for help here, Claude is pretty good at designing data models in my experience.
4. Start asking your agent questions about your data.
Keep steps 3-4 as a tight feedback loop. Every time your agent hallucinates or struggles to answer your questions, improve the model.
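To make step 3 concrete, here's a minimal sketch of turning a raw replicated table into a star schema (one fact table, one dimension table). All table and column names are made up for illustration; sqlite3 stands in for whatever warehouse you replicate into:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Raw operational data, as it might arrive from Postgres replication.
# (Hypothetical schema, purely illustrative.)
cur.execute(
    "CREATE TABLE raw_orders (id INTEGER, customer TEXT, country TEXT, amount REAL, day TEXT)"
)
cur.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "alice", "US", 30.0, "2024-01-01"),
        (2, "bob", "DE", 20.0, "2024-01-01"),
        (3, "alice", "US", 50.0, "2024-01-02"),
    ],
)

# Dimension: one row per customer, with descriptive attributes.
cur.execute("""
    CREATE TABLE dim_customer AS
    SELECT DISTINCT customer AS customer_key, country
    FROM raw_orders
""")

# Fact: one row per order, keyed to the dimension.
cur.execute("""
    CREATE TABLE fact_orders AS
    SELECT id AS order_id, customer AS customer_key, amount, day
    FROM raw_orders
""")

# The kind of question you'd then hand to the agent: "revenue by country".
rows = cur.execute("""
    SELECT d.country, SUM(f.amount)
    FROM fact_orders f
    JOIN dim_customer d USING (customer_key)
    GROUP BY d.country
    ORDER BY d.country
""").fetchall()
print(rows)  # [('DE', 20.0), ('US', 80.0)]
```

The point of the fact/dimension split is that the joins and grain are explicit, so the SQL the agent generates is easy to eyeball.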
Side note: I'm currently building a platform that does all 3 (though you still need to do 2 yourself), you just need Postgres + 1 command to set it up: https://polynya.dev/
> Claude is pretty good at designing data models in my experience
Yesterday, Claude decided to go with nvarchar(100) for an IP address column instead of varbinary(16), and thinks RBAR triggers are just as good as temporal tables.
So, no. Claude is not good at designing data models in my experience.
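For what it's worth, the fixed-width binary representation is easy to produce before insert: map IPv4 into the IPv6 space so a single 16-byte column covers both families. A minimal Python sketch (function names are mine, not from any library):

```python
import ipaddress

def ip_to_bytes(ip: str) -> bytes:
    """Pack an IPv4 or IPv6 address into exactly 16 bytes (varbinary(16))."""
    addr = ipaddress.ip_address(ip)
    if isinstance(addr, ipaddress.IPv4Address):
        # Represent IPv4 as an IPv4-mapped IPv6 address (::ffff:a.b.c.d),
        # which packs to the same 16-byte width as native IPv6.
        addr = ipaddress.IPv6Address(f"::ffff:{ip}")
    return addr.packed

def bytes_to_ip(raw: bytes) -> str:
    """Recover the original textual form from the 16-byte value."""
    addr = ipaddress.IPv6Address(raw)
    mapped = addr.ipv4_mapped
    return str(mapped) if mapped is not None else str(addr)

packed = ip_to_bytes("192.0.2.1")
print(len(packed))                       # 16
print(bytes_to_ip(packed))               # 192.0.2.1
print(len(ip_to_bytes("2001:db8::1")))   # 16
```

Fixed-width binary sorts and indexes far better than a 100-character string, which is exactly why the nvarchar(100) choice grates.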
> Side note: I'm currently building a platform
Oh ok this comment is just an ad then
Wide tables and rich data: dozens to hundreds of columns, some of them JSON dimensions. It's way easier to explore these datasets with AI.