Comment by egruy
10 days ago
I am building Tracelake - a data quality solution for SAP replications.
Why? SAP holds the most important data for companies that use it, but it's notoriously difficult to replicate this data consistently into a data analytics platform (think Snowflake, Redshift, etc.).
A couple of companies specialize in SAP replication, but it's hard to validate the correctness of the replicated data, because:
- the SAP data is changing continuously and rapidly
- there are hundreds of tables and TBs of data
Usually it's the consumers of data downstream who notice that the data just "doesn't feel right".
Tracelake adds a validation layer on top of the SAP-to-X replication: it periodically compares the data between source and target and informs you about any missing or incorrect data, so you can tackle data quality issues proactively.
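To make the idea of a source/target comparison concrete, here is a minimal sketch in Python. It is not Tracelake's actual implementation: the function names, the `id` and `amount` columns, and the in-memory row lists are all assumptions for illustration. It fingerprints each row with a hash and reports keys that are missing from the target, present but different, or present only in the target.

```python
import hashlib


def row_fingerprint(row):
    """Hash a row's column values so two copies can be compared cheaply."""
    canonical = "|".join(f"{col}={row[col]}" for col in sorted(row))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_tables(source_rows, target_rows, key):
    """Compare two row sets by primary key (hypothetical helper)."""
    source = {r[key]: row_fingerprint(r) for r in source_rows}
    target = {r[key]: row_fingerprint(r) for r in target_rows}
    return {
        # Keys in the source that never arrived in the target.
        "missing": sorted(k for k in source if k not in target),
        # Keys present on both sides whose contents differ.
        "mismatched": sorted(
            k for k in source if k in target and source[k] != target[k]
        ),
        # Keys in the target with no counterpart in the source.
        "unexpected": sorted(k for k in target if k not in source),
    }


# Made-up sample data standing in for one replicated table.
source_rows = [
    {"id": 1, "amount": "100.00"},
    {"id": 2, "amount": "250.00"},
    {"id": 3, "amount": "75.50"},
]
target_rows = [
    {"id": 1, "amount": "100.00"},
    {"id": 2, "amount": "205.00"},
    {"id": 4, "amount": "10.00"},
]

report = compare_tables(source_rows, target_rows, key="id")
print(report)
```

Because the source keeps changing, a real check would presumably need to compare both sides as of the same point in time and work on chunks rather than whole tables; this sketch ignores both concerns.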